PREFACE


THE primary aim of this booklet is to present a number of methods 
for the approximate solution of equations. Their practical value is 
indisputable, yet they are little studied in high school or college. It 
will often happen, therefore, that someone who may even have 
majored in mathematics will find it difficult to solve the simplest of 
transcendental equations. Not only engineers, but other specialists 
need to solve equations, and familiarity with methods of approximate 
solution of equations is useful for the high school and college student. 

Since most of the methods for the approximate solution of equa- 
tions are connected with the concept of the derivative, we have found 
it necessary to introduce this concept. We have based our treatment 
on an appeal to geometry, however. The reader therefore needs no 
more background than is provided by high school mathematics. 

In compiling this book the author made use of a lecture he gave 
to ninth- and tenth-grade students in the school mathematics circle 
at Moscow State University. 

The content of this lecture was adopted by S. I. Shvartsburd, a 
Moscow high school teacher, for work outside class hours with his 
ninth-grade pupils. The author thanks him for his solutions to 
some of the problems used for this book. 

The author expresses his deep gratitude to V. G. Boltyanskii, 
whose suggestions improved the first version of the manuscript. 


N. YA. VILENKIN 

INTRODUCTION 


THE school mathematics program devotes a great deal of time to 
solving equations and systems of equations. One first meets equa- 
tions of the first degree, and systems of such equations. Then 
quadratic, biquadratic, and irrational equations appear. Finally, 
one learns about exponential, logarithmic, and trigonometric 
equations. 

This concentration on equations is deliberate. It is explained by 
the important role played by equations in applied mathematics. 
Whatever area of application we may think of, we will almost 
always find that the final answer to some problem is to be obtained 
by solving an equation or a system of equations. 

As an example, we often need equations for the solution of school 
physics problems. Consider this problem: A stone is dropped in a 
well. Find the depth of the well given that you hear the sound of the 
stone’s hitting bottom after T sec. 

If we denote the depth of the well by $x$, we have the following 
equation for determining it: 

$$\sqrt{\frac{2x}{g}} + \frac{x}{v} = T,$$

where $g$ is the acceleration of gravity, $v$ is the speed of 
propagation of sound in air, the first term is the time the stone 
takes to reach the bottom, and the second is the time the sound takes 
to reach us. This is an irrational equation. If we write 
$\sqrt{x} = y$ we can rewrite the equation in the form 

$$\frac{y^2}{v} + \sqrt{\frac{2}{g}}\,y - T = 0,$$

which can be solved with the quadratic formula. 
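The substitution can be checked numerically. The following is only a sketch: the function name `well_depth`, the delay T = 3 sec, and the values g = 9.8 m/sec² and v = 340 m/sec are assumptions made for illustration and do not come from the text.

```python
import math

def well_depth(T, g=9.8, v=340.0):
    """Depth x of the well for a total delay of T seconds.

    Solves y**2 / v + sqrt(2 / g) * y - T = 0 for y = sqrt(x) with the
    quadratic formula, keeps the positive root, and returns x = y**2.
    The default g and v are assumed values, not taken from the text.
    """
    a = 1.0 / v
    b = math.sqrt(2.0 / g)
    c = -T
    y = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return y * y

depth = well_depth(3.0)
# Check: fall time plus sound travel time should reproduce T = 3.
fall_time = math.sqrt(2.0 * depth / 9.8)
sound_time = depth / 340.0
print(depth, fall_time + sound_time)
```

Only the positive root of the quadratic is kept, since $y = \sqrt{x}$ cannot be negative.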

Equations may also be used in solving geometric problems. 
Suppose we wish to divide a segment AB in the golden ratio, that is, 
to find a point C between A and B such that AB/AC = AC/CB. 
This leads to the quadratic equation 

$$x^2 + lx - l^2 = 0,$$

where $l$ is the length of $AB$ and $x$ that of $AC$. 

We are led to a more complex equation when we investigate the 
problem of trisecting an angle $\alpha$. The equation is 

$$4x^3 - 3x - \cos\alpha = 0,$$

where $x = \cos(\alpha/3)$. In algebra courses it is sometimes shown that a 
formula for the solution of cubic equations does exist (see formula 
(3) below). 

We often come across problems in physics that lead to more 
complicated equations, however—equations for which no formulas 
for the solutions are given either in school or in college. Consider a 
steel beam clamped firmly at each end. If we strike it, it will start 
oscillating transversely. It is shown in mathematical physics that if 
we are to find the frequency of these oscillations, we must first solve 
the equation 


$$e^x + e^{-x} = \frac{2}{\cos x}, \tag{1}$$

where $e = 2.71828\ldots$. 

No methods are taught in school for solving such an equation. 
This is not explainable, as one might suppose, by the limited time 
available for mathematics in school. Formulas for the solution of 
equations such as (1), in the high school sense of the word, simply do 
not exist. Let us put our assertion more precisely. 

We say that a formula exists for the solution of an equation if 
we may express its roots in terms of the constants appearing in the 
equation by means of arithmetic operations, the exponential, 
logarithmic, trigonometric, and inverse trigonometric functions. In 
this sense the quadratic equation $x^2 - 2px + q = 0$ has the following 
formula for its solution: 

$$x = p \pm \sqrt{p^2 - q}. \tag{2}$$


A formula exists for the solution of the general cubic equation 
$a_0x^3 + a_1x^2 + a_2x + a_3 = 0$ (with $a_0 \ne 0$). On substituting 

$$x = y - \frac{a_1}{3a_0}$$

we may reduce the general cubic to the form 

$$y^3 + 3py - 2q = 0,$$

which has the real root 

$$y = \sqrt[3]{q + \sqrt{q^2 + p^3}} + \sqrt[3]{q - \sqrt{q^2 + p^3}}. \tag{3}$$

The practical application of formula (3) is sometimes difficult, 
however, and may require the use of complex numbers. 
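A rough sketch of how formula (3) might be evaluated is given here. The helper names `cbrt` and `cubic_real_root` and the sample coefficients are assumptions made for illustration. Only the case $q^2 + p^3 \ge 0$ is handled; in the other case the square root is not real, which is the difficulty just mentioned.

```python
import math

def cbrt(t):
    """Real cube root, valid for negative t as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def cubic_real_root(p, q):
    """Real root of y**3 + 3*p*y - 2*q = 0 by formula (3).

    Only the case q**2 + p**3 >= 0 is handled; otherwise the inner
    square root is not real and complex arithmetic would be needed.
    """
    d = q * q + p ** 3
    if d < 0:
        raise ValueError("q**2 + p**3 < 0: formula (3) needs complex numbers")
    s = math.sqrt(d)
    return cbrt(q + s) + cbrt(q - s)

# Illustrative coefficients (an assumption): y**3 + 3*y - 4 = 0 has the root y = 1.
y = cubic_real_root(1.0, 2.0)
residual = y ** 3 + 3.0 * y - 4.0
print(y, residual)
```

For the trisection equation $4x^3 - 3x - \cos\alpha = 0$ we have $p = -\frac14$ and $q = \frac18\cos\alpha$, so that $q^2 + p^3 = (\cos^2\alpha - 1)/64 \le 0$, and the sketch above refuses to proceed whenever $\cos^2\alpha < 1$.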

There is a formula for the solution of the general quartic (or 
biquadratic) equation, but it is so complicated that we will not 
produce it. 

Equations of the fifth and higher degrees are not so simple. In 
1826 the Norwegian mathematician Abel proved that no formula 
exists for the solution of the general algebraic equation 

$$a_0x^n + a_1x^{n-1} + \dots + a_n = 0$$
of degree n greater than 4. Only in special cases do formulas exist 
for solving algebraic equations of high degree. 

If mathematicians limited themselves to the study of equations 
with exact solutions given by a formula, we might hear such a 
conversation between a mathematician and an engineer: 


ENGINEER: I have been designing a piece of equipment, and this 
equation came up (he shows the mathematician the equation). I must 
solve this equation quickly. The design has to be ready in a month. 

MATHEMATICIAN: I would be glad to help you, but there is no formula 
for the solution of this sort of equation. 

ENGINEER: Well, could you work out a formula for me? 

MATHEMATICIAN: I wouldn’t even try. It was proved long ago that no 
formulas exist for the solution of this kind of equation. 

After such a conversation the engineer’s opinion of the powers 
of mathematics would fall sharply. Fortunately, such conversations 
do not occur. The engineer usually does not need a formula to 
solve an equation. He needs only a root of the equation accurate 
to within a certain error, and whether a formula or some other 
method is used is a minor matter, for he will usually only need the 
formula to calculate the root to the necessary accuracy. 

Imagine that a formula is known for the solution of a certain 
equation, and that on applying it the engineer finds a root 
$x = 3 + \sqrt{13}$. It is clear that this solution cannot be used as it 
stands (after all, you could hardly ask a machinist to turn a shaft of 
diameter $3 + \sqrt{13}$ in.). For practical purposes you have to express 
$\sqrt{13}$ as a decimal fraction, taking a number of decimal places 
appropriate to the tolerance to which you are working. 

Thus the engineer will be satisfied if the mathematician gives him 
any method of calculating roots of an equation to the required 
accuracy. Many methods have been developed in mathematics for 
the approximate solution of equations. The purpose of this book 
is to describe some of them. 

SUCCESSIVE APPROXIMATIONS 


Most of the methods for the approximate solution of equations are 
based on the idea of successive approximations. This idea is applied 
not only to the solution of equations, but also to the solution of a 
wide variety of practical problems. 

A gunner would use a process of successive approximation. If he 
wants to hit a target, he starts by taking aim and firing. If he misses 
and can see where his shell exploded, he can make appropriate 
corrections in his aim and fire again. After a number of such 
“approximations,” the aim will be good enough to hit the target. 

Sometimes successive approximations are required even to determine 
the point of aim. Suppose you have an antiaircraft gun at the 
point $O$ and you are firing at a flying airplane (Fig. 1). If you aim 
at the point $A_0$, which is where the airplane is, you will miss, since 
the plane will have moved to the point $A_1$ during the time the shell 
is in the air. It is easy to calculate the position of the point $A_1$, 
knowing the speed of your shell and the plane. If instead of aiming 
at $A_0$ you aim at $A_1$ in the first place, you may still miss. You are 
now aiming at an angle, so that the shell will take longer to reach 
$A_1$ than it would to reach $A_0$. In the extra time the plane will have 
reached $A_2$. The distance from $A_1$ to $A_2$ will be much less than the 
distance from $A_0$ to $A_1$, however. To make the shot still closer, you 
could aim at $A_2$, or go on to calculate that by the time your shell 
reaches $A_2$ the plane will be at $A_3$, and aim at $A_3$ instead. After a 
number of such approximations you will be able to aim the shell so 
that it comes close enough to hit the plane (say, within 10 ft). 

[Fig. 1. The gun at $O$ and the successive airplane positions $A_0$, $A_1$, $A_2$.]

The method of successive approximations is used for the solution 
of many other problems as well. 

Suppose you have to transport sand from a number of sand 
quarries $A_1, \ldots, A_n$ to a number of building sites $B_1, \ldots, B_m$. 
Suppose the quarry $A_i$ produces $a_i$ tons of sand a day and the 
building site $B_j$ requires $b_j$ tons of sand a day. Suppose, finally, 
that the cost of transporting 1 ton of sand from quarry $A_i$ to site 
$B_j$ is $c_{ij}$. (This quantity will depend on the distance between $A_i$ 
and $B_j$, the state of the road between them, and other factors.) 

To determine our plan of delivery we draw up Table 1. Here 
$x_{ij}$ denotes the quantity of sand to be taken from $A_i$ to $B_j$ each day. 

TABLE 1 

$$\begin{array}{c|cccc}
 & B_1 & B_2 & \cdots & B_m \\ \hline
A_1 & x_{11} & x_{12} & \cdots & x_{1m} \\
A_2 & x_{21} & x_{22} & \cdots & x_{2m} \\
\vdots & \vdots & \vdots & & \vdots \\
A_n & x_{n1} & x_{n2} & \cdots & x_{nm}
\end{array}$$

It is clear that the $x_{ij}$ have to satisfy the following relationships: 

$$x_{i1} + x_{i2} + \dots + x_{im} \le a_i,$$

since you cannot take more than $a_i$ tons of sand from $A_i$ in a day, 
and 

$$x_{1j} + x_{2j} + \dots + x_{nj} = b_j,$$

since $b_j$ tons of sand are required at $B_j$ each day. 

According to Table 1, the total daily cost of transporting sand is 
given by 

$$\begin{aligned}
C = {} & c_{11}x_{11} + c_{12}x_{12} + \dots + c_{1m}x_{1m} \\
 & + c_{21}x_{21} + c_{22}x_{22} + \dots + c_{2m}x_{2m} \\
 & + \dots \\
 & + c_{n1}x_{n1} + c_{n2}x_{n2} + \dots + c_{nm}x_{nm}.
\end{aligned} \tag{4}$$

We must adopt a plan that makes $C$ minimal. As a first attempt, we 
might assign to the quarry $A_1$ the nearest building site to it, say, 
$B_j$. If $A_1$ produces more sand than $B_j$ requires, we assign another site 
$B_k$ to $A_1$, choosing the next closest one. After a number of such steps 
we will exhaust the daily productivity of $A_1$. If, on the other hand, $B_j$ 
requires more sand than $A_1$ produces, we assign another quarry $A_i$ 
to $B_j$, choosing the next closest one to $B_j$. Continuing in this 
way we shall finally have assigned quarries to every site, and sites to 
every quarry. 

The plan we have devised in this way may not be the best, how- 
ever, as we are finally left with only a few sites, and they may be 
very far from the remaining quarries. Some of the sites that we 
assigned to the first quarries will have to be reassigned to other 
quarries. 

Methods of changing plans in order to reduce the total cost are 
dealt with in the branch of mathematics called linear programming. 
A booklet on the subject has been published in this series.† 

After a number of successive changes in plan, made in accordance 
with the schemes devised in linear programming, we will 
reach a plan for which the sum $C$ is a minimum, or differs little from it. 

In general, in devising a plan, a timetable, or the like, we start 
with a crude approximation, and then improve it step by step until 
the required result is achieved. 

The machining of a part in a metal shop or a factory can be 
regarded as a sequence of successively better approximations to the 
required shape. First one takes a crude approximation—a casting 
or some other stock. This stock is machined to a form approximating 
that of the required part. It is then taken to another machine which 
works to greater accuracy. After a number of steps (successive 
approximations), the required part emerges. 


† Barsov, What Is Linear Programming?, Boston: D. C. Heath, 1963, or 
see A. Charnes, Lectures on the Mathematical Theory of Linear Programming 
(Part II of W. W. Cooper, An Introduction to Linear Programming, 
New York: Wiley, 1953). 

ACHILLES AND THE TORTOISE 


THE first discussion of successive approximations is found in the 
work of Zeno of Elea, who lived around 500 B.C. This philosopher 
tried to prove that there is no such thing as motion. Zeno’s argu- 
ment ran as follows: Achilles, the swiftest of the Greeks, will never 
catch up in a race with a tortoise. Suppose that Achilles starts out 
1000 paces behind the tortoise, and that he runs at a rate of 10 paces 
a second, while the tortoise crawls 1 pace a second. After 100 sec 
Achilles will have covered the 1000 paces separating him from the 
tortoise. But during this period the tortoise will have crawled 100 
paces further. After another 10 sec Achilles will have covered these 
100 paces, but the tortoise will still be 10 paces ahead. To cover these 
10 paces, Achilles needs only 1 sec, but meanwhile the tortoise has 
advanced 1 pace more. Thus the tortoise will always be ahead of 
Achilles. But this is ridiculous. The only conclusion is that motion 
is an illusion. 

Of course, Zeno’s argument is a brilliant paradox, but it does not 
prove that motion cannot exist. We shall not be concerned with the 
philosophical points raised by the paradox, but regard it, rather, as a 
method of successive approximation to the place and time where 
Achilles catches up with the tortoise. Any schoolboy can calculate 
this easily: if x is the required time, we form the equation 


$$1000 = 10x - x \tag{5}$$

and deduce 

$$x = \frac{1000}{9}\ \text{sec} = 111\tfrac{1}{9}\ \text{sec}.$$

To translate Zeno’s approximating argument into mathematical 
terms we proceed as follows: suppose we have found an approximate 
solution $x_n$ of our problem. By the conditions of the problem 
the tortoise will have crawled $x_n$ paces in this time. Achilles takes 
$x_n/10$ sec to run $x_n$ paces. Furthermore he takes 100 sec to run the 
first 1000 paces that originally separated him from the tortoise. 
So to arrive at the place where the tortoise was after $x_n$ sec, Achilles 
needs $x_{n+1}$ sec, where 

$$x_{n+1} = 100 + \frac{x_n}{10}. \tag{6}$$

Setting $x_0 = 0$, we obtain the successive numbers $x_1 = 100$, 
$x_2 = 110$, $x_3 = 111$, $x_4 = 111.1, \ldots$, which are the same numbers 
that turned up in Zeno’s argument. As $n$ increases the numbers 
$x_n$ approach closer and closer to the exact solution $x = 111\frac{1}{9}$ sec 
of equation (5). 

Let us note that the formula (6) is closely connected with equation 
(5), for we may rewrite the equation in the form 

$$x = 100 + \frac{x}{10}. \tag{7}$$

On substituting the value $x_0 = 0$ for $x$ in the right side of (7) we 
find $x_1 = 100$; on substituting this value for $x$ in the right side we 
find $x_2 = 110$, and so on. 

In the example given here the numbers $x_1, x_2, \ldots, x_n, \ldots$ 
approached the solution $111\frac{1}{9}$ of equation (5). Had Achilles been 
racing an antelope instead of a tortoise, this process of successive 
approximation would have failed. Suppose the antelope runs at a 
rate of 20 paces a second. Then our equation would be 

$$1000 = 10x - 20x, \tag{8}$$

and the approximating formula would be 

$$x_{n+1} = 100 + 2x_n. \tag{9}$$

If in this case we started by setting $x_0 = 0$, we would find $x_1 = 100$, 
$x_2 = 300$, $x_3 = 700, \ldots$ 

Thus the sequence $x_1, x_2, \ldots, x_n, \ldots$ would not approach 
the solution $x = -100$ of equation (8). This is not surprising, 
since after 100 sec the antelope will be 2000 paces ahead of Achilles, 
and thereafter the distance between them will continue to increase. 
It is natural, therefore, that the method of successive approximation 
will not yield a solution here. 
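The two races can be compared side by side in a few lines of Python. This is a sketch only; the helper name `iterate` and the number of steps are assumptions made for illustration.

```python
def iterate(step, x0, count):
    """Return [x_0, x_1, ..., x_count], where x_{n+1} = step(x_n)."""
    xs = [x0]
    for _ in range(count):
        xs.append(step(xs[-1]))
    return xs

# Formula (6): Achilles and the tortoise.
tortoise = iterate(lambda x: 100 + x / 10, 0.0, 8)
# Formula (9): Achilles and the antelope.
antelope = iterate(lambda x: 100 + 2 * x, 0.0, 8)

print(tortoise[:5])   # approaches 1000/9 = 111.11...
print(antelope[:4])   # moves steadily away from x = -100
print(abs(tortoise[-1] - 1000 / 9))
```

In the first iteration each step multiplies the error by 1/10, and in the second by 2, which is why one sequence converges and the other does not.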
DIGITAL COMPUTERS 


THE reader may wonder why we had to solve equation (5) by succes- 
sive approximation instead of simply solving it directly. But, of 
course, we were interested not so much in equation (5) as in the 
method of successive approximation, which we intend to apply later 
to more complex equations. 

It should be pointed out, however, that the solution by successive 
approximation of equations such as (5) is carried out on certain 
high-speed digital computers that cannot perform division in any 
other way. Such a digital computer can carry out the three fundamental 
arithmetical operations: addition, subtraction, and multiplication. 
It can also divide by numbers of the form $2^n$. How are such 
machines to divide by any number? 

The division of the number $b$ by the number $a$ is the process of 
finding the solution of the equation $ax = b$. Since the machine can 
multiply and divide by powers of 2, we may suppose that $\frac12 \le a < 1$. 
If this is not so, we can either divide or multiply both sides of the 
equation $ax = b$ by an appropriate power of 2. We now rewrite 
the equation in the form 

$$x = (1 - a)x + b. \tag{10}$$

As our first approximation for $x$ we take $x_1 = b$. Suppose our 
error is $\alpha_1$ (positive or negative), that is, suppose $x_1 + \alpha_1 = b/a$. Then 
from (10) we obtain 

$$\begin{aligned}
x_1 + \alpha_1 &= (1 - a)(x_1 + \alpha_1) + b \\
 &= (1 - a)x_1 + b + (1 - a)\alpha_1.
\end{aligned} \tag{11}$$

Since $a$ lies between $\frac12$ and 1, we have 

$$0 < 1 - a \le \tfrac12.$$

Because of this, the summand $(1 - a)\alpha_1$ on the right side is at most 
half as large as $\alpha_1$. So we discard it and obtain 

$$x_1 + \alpha_1 \approx (1 - a)x_1 + b.$$

The number $x_2 = (1 - a)x_1 + b$ is our next approximation to $x$. 

Let us denote by $\alpha_2$ the error in our approximation $x_2$, that is, 
we suppose $x_2 + \alpha_2 = b/a$. We know that $\alpha_2$ is at most half as big 
as $\alpha_1$. From equation (10) we now obtain 

$$x_2 + \alpha_2 = (1 - a)x_2 + b + (1 - a)\alpha_2.$$

Rejecting the summand $(1 - a)\alpha_2$ we obtain the approximation 

$$x_2 + \alpha_2 \approx (1 - a)x_2 + b.$$

We may therefore take the next approximation to be 

$$x_3 = (1 - a)x_2 + b.$$

Arguing in the same way, we find that the next approximation is 

$$x_4 = (1 - a)x_3 + b,$$

and so on. Each of the numbers $x_1, x_2, \ldots, x_n, \ldots$ is obtained 
successively from the formula 

$$x_{n+1} = (1 - a)x_n + b, \tag{12}$$

and our argument shows that the error of each number is at most 
half the error of its predecessor. Thus the numbers approach 
indefinitely close to $b/a$. This formula needs only the operations of 
addition, multiplication, and subtraction, however, and therefore 
can be used by a digital computer. 
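Formula (12) can be tried out directly. The sketch below assumes that $a$ has already been scaled so that $\frac12 \le a < 1$; the function name `divide`, the fixed number of steps, and the sample values are assumptions made for illustration and say nothing about how any particular machine is built.

```python
def divide(b, a, steps=60):
    """Approximate b / a for 1/2 <= a < 1 by repeating formula (12).

    Only addition, subtraction, and multiplication are used.
    """
    if not 0.5 <= a < 1.0:
        raise ValueError("scale a by a power of 2 so that 1/2 <= a < 1")
    x = b                      # first approximation x_1 = b
    for _ in range(steps):
        x = (1.0 - a) * x + b  # x_{n+1} = (1 - a) x_n + b
    return x

quotient = divide(3.0, 0.75)   # illustrative values
print(quotient)
```

Since each step at least halves the error, 60 steps reduce it by a factor of at least $2^{60}$, which is below the resolution of ordinary double-precision arithmetic.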

As a matter of fact, the method we have described for division is 
based on the formula for the sum of an infinite geometric series. 
We may write the fraction $b/a$ in the form 

$$\frac{b}{a} = \frac{b}{1 - (1 - a)}.$$

But according to the formula, 

$$\frac{b}{1 - (1 - a)} = b + b(1 - a) + b(1 - a)^2 + \dots + b(1 - a)^n + \dots \tag{13}$$

Let us denote by $x_n$ the sum of the first $n$ terms of this series: 

$$x_n = b + b(1 - a) + \dots + b(1 - a)^{n-1}.$$

It is clear that 

$$\begin{aligned}
x_{n+1} &= b + b(1 - a) + \dots + b(1 - a)^n \\
 &= b + (1 - a)\left[b + b(1 - a) + \dots + b(1 - a)^{n-1}\right] \\
 &= b + (1 - a)x_n.
\end{aligned}$$

This formula coincides with (12). Also $x_1 = b$ coincides with our 
choice of a first approximation. Thus our approximation $x_n$ for 
the value of $b/a$ amounts to replacing an infinite sum (13) by the 
sum of its first $n$ terms. As $n$ increases, this finite sum approaches 
indefinitely close to the infinite sum. We have already seen that the 
difference between $x_{n+1}$ and $b/a$ is at most half the difference between 
$x_n$ and $b/a$, and therefore the error of $x_{n+1}$ is at most $\frac{1}{2^n}\left[(b/a) - b\right]$, 
which tends to zero as $n$ increases indefinitely. 
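The agreement between the partial sums of (13) and the iteration (12) can be checked numerically. In this sketch the function names `partial_sum` and `iterate_12` and the values of $a$ and $b$ are assumptions made for illustration.

```python
def partial_sum(b, a, n):
    """Sum of the first n terms b + b(1 - a) + ... + b(1 - a)**(n - 1)."""
    return sum(b * (1.0 - a) ** k for k in range(n))

def iterate_12(b, a, n):
    """x_n obtained from x_1 = b by repeating x_{m+1} = (1 - a) x_m + b."""
    x = b
    for _ in range(n - 1):
        x = (1.0 - a) * x + b
    return x

b, a = 3.0, 0.75   # illustrative values, not from the text
for n in range(1, 8):
    s, x = partial_sum(b, a, n), iterate_12(b, a, n)
    bound = abs(b / a - b) / 2 ** (n - 1)   # bound on the error of x_n
    print(n, s, x, abs(b / a - x) <= bound)
```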


EXTRACTING SQUARE ROOTS BY SUCCESSIVE APPROXIMATION 


WE now show how we can use the method of successive approxima- 
tion to extract square roots. A method for doing this is taught in 
school, and it allows us to determine the successive decimal places 
of the root, one by one. This method can be regarded as one method 
of successive approximation to the solution. It is rather complicated, 
however, and pupils are apt to apply it mechanically, without 
understanding the underlying idea. We shall describe a different 
method, which was used in Babylon hundreds of years B.c. It was 
also used by the mathematician Hero, of Alexandria. Later, this 
method fell into disuse, but it is now used for extracting square roots 
on certain digital computers. 

Suppose we are to take the square root 
of 28. We first choose an approximate value for the root, say 
$x_1 = 5$. We denote the error by $\alpha_1$, so that $\sqrt{28} = 5 + \alpha_1$. To 
find the value of $\alpha_1$, we square both sides of this equation. We find 
that 

$$28 = 25 + 10\alpha_1 + \alpha_1^2,$$

that is, 

$$\alpha_1^2 + 10\alpha_1 - 3 = 0. \tag{14}$$

We have thus obtained a quadratic equation for $\alpha_1$. If we try to 
solve this equation exactly, we obtain the roots $\alpha_1 = -5 \pm \sqrt{28}$. 
Thus to determine $\alpha_1$ exactly, we must first find $\sqrt{28}$. We seem to be 
in a vicious circle: to find $\sqrt{28}$ we need $\alpha_1$, and to find $\alpha_1$ we need 
$\sqrt{28}$. 

The following consideration allows us to escape: the error $\alpha_1$ 
of the approximate solution $x_1 = 5$ is not large (it is surely less than 
1). So $\alpha_1^2$ will be still smaller. Let us therefore find an approximate 
value for $\alpha_1$ by disregarding the small term $\alpha_1^2$ in equation (14). We 
obtain an approximate equation for $\alpha_1$: $10\alpha_1 - 3 \approx 0$, giving us the 
approximate value $\alpha_1 \approx 0.3$. 

We have thus found an approximate value of the correction 
term $\alpha_1$. Since $\sqrt{28} = 5 + \alpha_1$, our second approximation $x_2$ for 
$\sqrt{28}$ will be 

$$x_2 = 5 + 0.3 = 5.3.$$

To find an even more accurate approximation to $\sqrt{28}$ we repeat 
the process. We denote the error in the solution $x_2 = 5.3$ by $\alpha_2$. 
That is, $\sqrt{28} = x_2 + \alpha_2$. We square both sides of this equation 
and reject the small term $\alpha_2^2$. We find that $28 \approx x_2^2 + 2x_2\alpha_2$, and 
therefore that 

$$\alpha_2 \approx \frac{28 - x_2^2}{2x_2}.$$

This means that our third approximation to $\sqrt{28}$ is given by the 
formula 

$$x_3 = x_2 + \frac{28 - x_2^2}{2x_2} = \frac{28 + x_2^2}{2x_2}.$$

Since $x_2 = 5.3$, we find that $x_3 = 5.2915\ldots$ In the same way, 
starting from the approximate value $x_3 = 5.2915$, we find a further 
approximation $x_4$ to the solution, given by 

$$x_4 = \frac{28 + x_3^2}{2x_3} = 5.2915\ldots$$

In general, if we already have an approximate value $x_n$ for $\sqrt{28}$, 
our next approximation will be 

$$x_{n+1} = \frac{28 + x_n^2}{2x_n}. \tag{15}$$

Each successive step brings us closer to the exact value. We may 
cut the sequence short whenever the difference between $x_{n+1}$ and $x_n$ 
becomes less than the margin of error we can tolerate. If we are 
computing $\sqrt{28}$ to an accuracy of 0.0001, for example, it is enough 
to calculate four approximations and to take $\sqrt{28} = 5.2915$. For 
both $x_3$ and $x_4$ are of the form $5.2915\ldots$ 

We may likewise extract the square root of any positive number. 
Thus to find $\sqrt{a}$ we choose a first approximation $x_1$ and then compute 
the successive approximations according to the formula 

$$x_{n+1} = \frac{a + x_n^2}{2x_n}. \tag{16}$$
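Formula (16), together with the stopping rule just described, might be sketched as follows. The function name `babylonian_sqrt` and the parameters `tol` and `max_steps` are assumptions made for illustration.

```python
def babylonian_sqrt(a, x1, tol=1e-4, max_steps=100):
    """Approximate sqrt(a) by formula (16): x_{n+1} = (a + x_n**2) / (2 x_n).

    Stops as soon as two successive approximations differ by less than
    tol, and returns the list of all approximations computed.
    """
    if a <= 0 or x1 <= 0:
        raise ValueError("a and the first approximation must be positive")
    xs = [x1]
    for _ in range(max_steps):
        x = xs[-1]
        xs.append((a + x * x) / (2.0 * x))
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs

approximations = babylonian_sqrt(28.0, 5.0)
for n, x in enumerate(approximations, start=1):
    print(n, x)
```

Starting from 5 with a tolerance of 0.0001, the loop stops at the fourth approximation, in agreement with the hand calculation above.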
Formula (16) may be derived by a somewhat different argument 
from the one we used in calculating $\sqrt{28}$. Suppose we have already 
found the $n$th approximation $x_n$ to $\sqrt{a}$. Since $\sqrt{a} = \sqrt{x_n(a/x_n)}$, 
$\sqrt{a}$ is the geometric mean of $x_n$ and $a/x_n$. As an approximation to 
this geometric mean we take the arithmetic mean of the numbers 
by setting 

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{a}{x_n}\right) = \frac{x_n^2 + a}{2x_n}.$$

This is precisely formula (16). 

Thus the method described earlier is equivalent to approximating 
at each stage the geometric mean of the numbers $x_n$ and $a/x_n$ by their 
arithmetic mean. 

We shall now determine whether this method of successively 
approximating a square root always works, that is, whether things 
will always turn out as they did when Achilles raced the tortoise and 
never as when he raced the antelope. In the first case mathematicians 
say that the sequence of approximations converges, and in the second 
that it diverges. We shall show that no complications arise in our 
process for extracting square roots—that the sequence of approxi- 
mations always converges. 

To do so we compare the errors $\alpha_n = \sqrt{a} - x_n$ and 
$\alpha_{n+1} = \sqrt{a} - x_{n+1}$ of two successive approximations. By formula (16) the 
error $\alpha_{n+1}$ can be written 

$$\alpha_{n+1} = \sqrt{a} - x_{n+1} = \sqrt{a} - \frac{a + x_n^2}{2x_n} = -\frac{x_n^2 - 2x_n\sqrt{a} + a}{2x_n}.$$

But 

$$x_n^2 - 2x_n\sqrt{a} + a = (x_n - \sqrt{a})^2 = \alpha_n^2,$$

and therefore 

$$\alpha_{n+1} = -\frac{\alpha_n^2}{2x_n}. \tag{17}$$

We are considering only positive approximations $x_n$ to $\sqrt{a}$. We 
can therefore deduce from equation (17) that all the errors $\alpha_2, \alpha_3, 
\ldots, \alpha_n, \ldots$ are negative. In other words, every approximation 
$x_n$, starting with $x_2$, is an approximation from above. This is true 
because the arithmetic mean of unequal numbers is always greater 
than their geometric mean. Of course, the first approximation may 
be either smaller or larger than the correct value. 

Using formula (17) we may easily prove that the absolute values of 
the successive errors are at least halved at each step. We can 
write (17) as 

$$\alpha_{n+1} = -\frac{\alpha_n}{2x_n}\,\alpha_n = -\frac{\sqrt{a} - x_n}{2x_n}\,\alpha_n = \left(\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right)\alpha_n.$$

Thus 

$$|\alpha_{n+1}| = \left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right| |\alpha_n|. \tag{18}$$

But since $x_n > 0$, 

$$\frac{1}{2} - \frac{\sqrt{a}}{2x_n} < \frac{1}{2}.$$

On the other hand, we have shown above that for $n \ge 2$ we have 
$x_n > \sqrt{a}$, and therefore 

$$\frac{1}{2} - \frac{\sqrt{a}}{2x_n} > 0.$$

From this we deduce the inequality 

$$\left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right| < \frac{1}{2}. \tag{19}$$

Comparing (18) with (19) we see that 

$$|\alpha_{n+1}| < \tfrac{1}{2}|\alpha_n|.$$

This proves our assertion. It follows that after the second approximation 
we have at least quartered our original error, after the third 
stage divided it by at least 8, and so on. It is clear that as $n$ increases 
the absolute value of the error $\alpha_n = \sqrt{a} - x_n$ decreases steadily to 
zero. But this means that $x_n$ tends to $\sqrt{a}$ as $n$ increases. 

Let us consider now the effect of our first choice $x_1$ on the sequence 
of approximations. We first note that our choice makes no difference 
in the end result, for we have already proved that whatever our 
initial (positive) approximation $x_1$, the successive errors $\alpha_2, \alpha_3, 
\ldots, \alpha_n, \ldots$ of later approximations tend to zero. Thus if we 
are told the accuracy within which we have to calculate $\sqrt{a}$, we can 
always reach this degree of accuracy after a sufficient number of 
steps. Even if we make a bad first choice, we finally attain the correct 
value within the specified margin of error. After ten steps the absolute 
value of our error will have decreased by a factor of over 
1000 ($2^{10} = 1024 \approx 1000$), and after forty steps by at least a 
thousand billion ($10^{12}$). Thus if we start with the guess of a million 
($10^6$) for $\sqrt{2}$, then $|\alpha_1| \approx 10^6$, and therefore $|\alpha_{40}| < 10^{-6}$. In other 
words, our initial error was close to a million and our final error to a 
millionth. 
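How much a bad first choice costs can be estimated by counting steps. The following is a sketch under stated assumptions: the function name `steps_needed`, the tolerance, and the second starting value 1.5 are chosen only for illustration, and `math.sqrt` is used solely to measure the error.

```python
import math

def steps_needed(a, x1, tol):
    """Count steps of x_{n+1} = (a + x_n**2) / (2 x_n) until |x_n - sqrt(a)| < tol.

    math.sqrt serves here only as a reference value for the error.
    """
    root = math.sqrt(a)
    x, steps = x1, 0
    while abs(x - root) >= tol:
        x = (a + x * x) / (2.0 * x)
        steps += 1
    return steps

bad = steps_needed(2.0, 1.0e6, 1.0e-6)   # first guess of a million
good = steps_needed(2.0, 1.5, 1.0e-6)    # a reasonable first guess
print(bad, good)
```

The bad guess needs many more steps than the good one, but still fewer than forty, because the halving bound is a pessimistic estimate once $x_n$ comes near $\sqrt{2}$.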

Nevertheless, the choice of a first approximation does have an 
effect on the number of approximations we need to calculate to 
reach a certain accuracy. If we make a bad first choice, we shall have 
a long wait before $x_n$ gets close to the true value. A good first choice 
speeds up the process considerably. We therefore often take the 
first approximation $x_1$ from a table of square roots and use the 
formula 

$$x_2 = \frac{a + x_1^2}{2x_1} \tag{20}$$

merely to obtain a better approximation. 

Such a procedure is especially valuable because the speed with 
which the approximations improve increases greatly when $x_n$ is 
close to $\sqrt{a}$. In deducing the inequality 

$$|\alpha_{n+1}| < \tfrac{1}{2}|\alpha_n|$$

we replaced a factor $\left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right|$ in (18) by $\frac{1}{2}$. But if $x_n$ is close 
to $\sqrt{a}$, then the value $\frac{1}{2} - \frac{\sqrt{a}}{2x_n}$ is very small, and therefore 

$$|\alpha_{n+1}| = \left|\frac{1}{2} - \frac{\sqrt{a}}{2x_n}\right| |\alpha_n|$$

is considerably smaller than $\frac{1}{2}|\alpha_n|$. 

Let us make this more precise. We consider together with the 
absolute error $|\alpha_n| = |\sqrt{a} - x_n|$ the relative error $\beta_n$ of the approximating 
value $x_n$ (the ratio of the absolute error $|\alpha_n|$ to the exact value 
of $\sqrt{a}$). This error is given by the formula 

$$\beta_n = \frac{|\alpha_n|}{\sqrt{a}} = \frac{|\sqrt{a} - x_n|}{\sqrt{a}}.$$

From (17) we find this formula for the value of $\beta_{n+1}$: 

$$\beta_{n+1} = \frac{|\alpha_{n+1}|}{\sqrt{a}} = \frac{\alpha_n^2}{2x_n\sqrt{a}} = \frac{\sqrt{a}}{2x_n}\,\beta_n^2.$$

Since $x_n > \sqrt{a}$ for $n \ge 2$, the factor $\sqrt{a}/(2x_n)$ is less than $\frac{1}{2}$. 
Thus the relative errors $\beta_n$ of the approximate values satisfy the 
inequality 

$$\beta_{n+1} < \frac{\beta_n^2}{2}. \tag{21}$$

For instance, if the relative error of the approximation $x_n$ is 0.01, 
then for $x_{n+1}$ it will not be greater than 0.00005, and for $x_{n+2}$ it will 
not be greater than 0.0000000013. We see that the accuracy of the 
approximations increases more and more rapidly. We can show 
that every successive approximation (once we are sufficiently close 
to $\sqrt{a}$) approximately doubles the number of correct decimal places. 


EXAMPLE. Calculate $\sqrt{238}$ to an accuracy of 0.00001. 

Using a four-place table, we find the value 15.43 for $\sqrt{238}$. 
Let us take this value for $x_1$ and find $x_2$ by the formula 

$$x_2 = \frac{15.43^2 + 238}{30.86} = 15.42725\ldots$$

What is the degree of accuracy of this estimate? Since the error in 
the value 15.43 is not greater than 0.01, we may safely take $|\alpha_1| = 0.01$, 
and therefore 

$$\beta_1 \approx \frac{0.01}{15.43} < 0.001.$$

But then 

$$\beta_2 < \frac{0.001^2}{2} = 0.0000005.$$

This means that the absolute error of our approximation $x_2$ is not 
greater than $15.43(0.0000005) < 0.00001$. In other words, all five 
decimal places in the value 15.42725 for $\sqrt{238}$ are correct. 

If we wanted to calculate the square root to fourteen decimal 
places, a third approximation would be enough. Such a degree of 
accuracy is never needed in practice, however. 
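The figures of this example, and inequality (21), can be checked with a short computation. This is only a sketch: the variable names are illustrative, and `math.sqrt` is used solely as a reference value for measuring the true errors.

```python
import math

a = 238.0
root = math.sqrt(a)            # reference value, used only to measure errors

values = [15.43]               # x_1, as read from a four-place table
for _ in range(2):
    x = values[-1]
    values.append((a + x * x) / (2.0 * x))   # formula (16)

# Relative errors beta_n of x_1, x_2, x_3.
betas = [abs(root - x) / root for x in values]

for n, (x, beta) in enumerate(zip(values, betas), start=1):
    print(n, x, beta)
# Inequality (21) for the first step: beta_2 against beta_1**2 / 2.
print(betas[1], betas[0] ** 2 / 2.0)
```

For the third approximation the computed error is of the same order as the rounding error of double-precision arithmetic, so only the first step is compared with the bound.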

Let us conclude this chapter by noting a special characteristic of 
our method of successive approximation. In the ordinary method of 
taking a square root, any mistake, at any stage, completely invali- 
dates all subsequent calculations. This is not true of our method of 
successive approximation. Suppose we make a mistake, obtaining 
$y_n$ where we should have obtained $x_n$. Then all our subsequent 
work may be regarded as the process of obtaining $\sqrt{a}$ from the 
initial value $y_n$. But we saw above that whatever our first choice we 
will always ultimately get as close as we like to $\sqrt{a}$. Thus the error 
we have committed will tend to zero, and its only effect will be to 
force us to calculate a number of extra approximations. 

Because of this property, we may calculate our early approxima- 
tions to a smaller number of decimal places, and use the required 
number only in the later ones. This saves us some unnecessary 
calculation. 

EXTRACTING KTH ROOTS BY SUCCESSIVE APPROXIMATION 


THE method described in the last section for extracting square roots 
can be applied to the extraction of kth roots for any positive integer 
k. We shall need the following formula: 


(x + a)P=x* + kx*® le +...,, (22) 


where the dots replace terms containing higher powers of a: «?, a, 
and so on. This is part of Newton’s binomial theorem, but we do 
not assume that the reader is acquainted with it. 

Let us prove formula (22). It is well known from high school 
mathematics that 


(x + a«)® = x? + 2xa + &?, 
(x + a)? = x9 + 3x20 + 3x0? + a3. 
We can rewrite these formulas in the following form: 
(x + ao)? = x*+4+ 2xa+...,, (23) 
(x + oF = x3 4+ 3X04... (24) 


Thus we have proved formula (22) for k = 2 and k = 3. Let us 
now multiply both sides of (24) by x + «. We find that 


(x + a4 = (08 + 3x20 +... x + x). 


If we expand the right side, we obtain one summand $x^4$ containing
no power of $\alpha$, and two summands $3x^3\alpha$ and $x^3\alpha$ containing $\alpha$ to
the first power. All the other summands will contain $\alpha$ to the second
or a higher power. We may therefore write


$$(x + \alpha)^4 = x^4 + 3x^3\alpha + x^3\alpha + \ldots = x^4 + 4x^3\alpha + \ldots, \tag{25}$$


where, as in (22), the dots denote terms containing higher powers of $\alpha$.
We have thus proved formula (22) for k = 4 as well as for k = 2
and k = 3. In the same way we may prove from formula (25) that
$$(x + \alpha)^5 = x^5 + 5x^4\alpha + \ldots \tag{26}$$


It is clear that the same process of proof establishes (22) for any 
positive integer k. 
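
The general step of this argument can be sketched once and for all. Assuming that formula (22) already holds for some exponent k, we multiply both sides by $x + \alpha$ and keep only the terms free of $\alpha$ and those containing $\alpha$ to the first power:

$$(x + \alpha)^{k+1} = (x^k + kx^{k-1}\alpha + \ldots)(x + \alpha) = x^{k+1} + x^k\alpha + kx^k\alpha + \ldots = x^{k+1} + (k + 1)x^k\alpha + \ldots$$

This is formula (22) with k + 1 in place of k, so the formula passes from each exponent to the next.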

We now return to the problem of extracting kth roots. Suppose 
we already have some approximation $x_1$ to $\sqrt[k]{a}$. We denote, as
usual, the error in this approximation by $\alpha_1$, so that $x_1 + \alpha_1 = \sqrt[k]{a}$.
Then $(x_1 + \alpha_1)^k = a$. But by formula (22) this equation can be
written

$$x_1^k + kx_1^{k-1}\alpha_1 + \ldots = a,$$


where the dots again replace terms with powers of $\alpha_1$ higher than the
first.

If our approximation $x_1$ is close to $\sqrt[k]{a}$, then the error $\alpha_1$ is small,
and therefore we may neglect terms containing high powers of this
error. We thus obtain an approximate equality


$$x_1^k + kx_1^{k-1}\alpha_1 \approx a.$$


From this equality it follows that


$$\alpha_1 \approx \frac{a - x_1^k}{kx_1^{k-1}},$$


and therefore we take as our second approximation to $\sqrt[k]{a}$


$$x_2 = x_1 + \frac{a - x_1^k}{kx_1^{k-1}} = \frac{a + (k - 1)x_1^k}{kx_1^{k-1}}.$$


In the same way, starting from $x_2$, we find our next approximation


$$x_3 = \frac{a + (k - 1)x_2^k}{kx_2^{k-1}}.$$


In general, if we have found an approximation $x_n$ to $\sqrt[k]{a}$, then we
take as our next approximation


$$x_{n+1} = \frac{a + (k - 1)x_n^k}{kx_n^{k-1}}. \tag{27}$$


As in the case of square roots, it may be proved that this process
will converge to $\sqrt[k]{a}$ whatever our choice of $x_1$ (so long as it is
positive). In other words, for any positive number $x_1$ the sequence
$x_1, x_2, \ldots, x_n, \ldots$, where for each n, $x_{n+1}$ is calculated from $x_n$
by formula (27), will tend to $\sqrt[k]{a}$. To calculate $\sqrt[k]{a}$ to within a
certain error, we continue the process of approximation until $x_n$
and $x_{n+1}$ coincide to the appropriate number of decimal places.


EXAMPLE. Calculate $\sqrt[3]{970}$ to within 0.001.
With k = 3 our approximating formula (27) takes the form


$$x_{n+1} = \frac{a + 2x_n^3}{3x_n^2}. \tag{28}$$


In our case a = 970. We choose $x_1 = 10$. It follows from (28) that
$$x_2 = \frac{970 + 2 \cdot 10^3}{3 \cdot 10^2} = \frac{2970}{300} = 9.9,$$
$$x_3 = \frac{970 + 2 \cdot 9.9^3}{3 \cdot 9.9^2} = \frac{2910.60}{294.03} = 9.899.$$
We see that $x_2$ and $x_3$ coincide to within the specified error. Thus
$$\sqrt[3]{970} = 9.899$$


to within 0.001.
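
The procedure of this section can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `kth_root`, the default first approximation, and the stopping test (two successive approximations closer together than `tol`) are our choices, not part of the text.

```python
def kth_root(a, k, x1=1.0, tol=1e-3, max_steps=100):
    """Approximate the k-th root of a > 0 by iterating formula (27)."""
    x = x1
    for _ in range(max_steps):
        # formula (27): x_{n+1} = (a + (k - 1) x_n^k) / (k x_n^(k-1))
        x_next = (a + (k - 1) * x**k) / (k * x**(k - 1))
        if abs(x_next - x) < tol:   # successive approximations agree
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_steps")

# The worked example: the cube root of 970, starting from 10.
print(round(kth_root(970, 3, x1=10.0), 3))
```

Starting from a poor first approximation only costs extra steps, in line with the remark that any positive starting value may be used.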
THE DERIVATIVE OF A
POLYNOMIAL 


THE extraction of the kth root of a number a may be regarded as a
process of solution of the equation


$$x^k - a = 0.$$


The problem is thus a special case of the following problem: given
an algebraic equation, that is, an equation of the form


$$a_0x^k + a_1x^{k-1} + \ldots + a_k = 0,$$


to find an approximate solution. 

In the next section we describe Newton’s method of approxi- 
mately solving such equations, which is a direct generalization of the 
method we described for extracting kth roots. We shall start by
introducing the concept of a derivative, one of the central concepts of 
higher mathematics. For the time being we define the derivative only 
for polynomials. 

Let 


$$f(x) = a_0x^k + a_1x^{k-1} + \ldots + a_k$$


be any polynomial function. We consider the polynomial $f(x + \alpha)$,
that is, the expression


$$a_0(x + \alpha)^k + a_1(x + \alpha)^{k-1} + \ldots + a_k. \tag{29}$$


If we remove the parentheses in (29), then some of the resulting
terms will not contain $\alpha$, some will have $\alpha$ appearing to the first
power, some to the second, some to the third, and so on. We shall
group the terms together to obtain an expression


$$f(x + \alpha) = f_0(x) + f_1(x)\alpha + f_2(x)\alpha^2 + \ldots + f_k(x)\alpha^k. \tag{30}$$


Since the polynomial f is of degree k, the highest power of $\alpha$ that
appears in (30) is precisely k. It is clear that $f_0(x)$, $f_1(x)$, ...,
$f_k(x)$ are all polynomials in x.
EXAMPLE. Suppose
$$f(x) = 2x^3 - 3x^2 + 6x - 1.$$
Then


$$\begin{aligned}
f(x + \alpha) &= 2(x + \alpha)^3 - 3(x + \alpha)^2 + 6(x + \alpha) - 1 \\
&= 2(x^3 + 3x^2\alpha + 3x\alpha^2 + \alpha^3) - 3(x^2 + 2x\alpha + \alpha^2) + 6(x + \alpha) - 1 \\
&= (2x^3 - 3x^2 + 6x - 1) + (6x^2 - 6x + 6)\alpha + (6x - 3)\alpha^2 + 2\alpha^3.
\end{aligned}$$
So in this case
$$f_0(x) = 2x^3 - 3x^2 + 6x - 1,$$
$$f_1(x) = 6x^2 - 6x + 6,$$
$$f_2(x) = 6x - 3,$$
$$f_3(x) = 2.$$
We see that the term $f_0(x)$ coincides with f(x). This is not accidental.
Equation (30) is an identity in x and $\alpha$, and on putting $\alpha = 0$, as we
may, we find that $f_0(x) = f(x)$.
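
The grouping by powers of $\alpha$ in (30) can be checked mechanically. The sketch below is our own illustration: it assumes a polynomial is stored as a list of coefficients with the highest power first, and it uses the full binomial coefficients (`math.comb`), which the text itself does not need beyond the first two terms.

```python
from math import comb

def expand_in_alpha(coeffs):
    """Return [f_0, f_1, ..., f_k], the coefficients of powers of alpha in f(x + alpha).

    coeffs lists a_0, a_1, ..., a_k (highest power of x first); each f_j is
    returned in the same highest-power-first convention.
    """
    k = len(coeffs) - 1
    parts = []
    for j in range(k + 1):
        # a_i * x**m (with m = k - i) contributes a_i * C(m, j) * x**(m - j) to f_j
        f_j = [a * comb(k - i, j) for i, a in enumerate(coeffs) if k - i >= j]
        parts.append(f_j)
    return parts

# The example above: f(x) = 2x^3 - 3x^2 + 6x - 1.
print(expand_in_alpha([2, -3, 6, -1]))
```

The second list in the result holds the coefficients of $f_1(x)$, the polynomial that multiplies the first power of $\alpha$.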

We now concentrate our attention on the next term, $f_1(x)\alpha$. The
coefficient of $\alpha$, that is, the polynomial $f_1(x)$, is called the derivative
of the polynomial f(x). Thus we have shown that the derivative of
$2x^3 - 3x^2 + 6x - 1$ is $6x^2 - 6x + 6$. The derivative of the
polynomial f(x) is usually written $f'(x)$.

Thus the derivative $f'(x)$ of a polynomial f(x) is defined as the
coefficient of $\alpha$ in the expansion by powers of $\alpha$ of the polynomial


$$f(x + \alpha).$$


Using our new notation, we can rewrite (30) in the form


$$f(x + \alpha) = f(x) + f'(x)\alpha + \ldots, \tag{31}$$


where dots replace terms containing $\alpha^2, \ldots, \alpha^k$.
For example,


$$2(x + \alpha)^3 - 3(x + \alpha)^2 + 6(x + \alpha) - 1 = 2x^3 - 3x^2 + 6x - 1 + (6x^2 - 6x + 6)\alpha + \ldots$$


We have introduced the concept of the derivative $f'(x)$ of a general
polynomial function f(x). We now proceed to calculate it knowing f.
We consider the polynomial


$$f(x + \alpha) = a_0(x + \alpha)^k + a_1(x + \alpha)^{k-1} + \ldots + a_{k-1}(x + \alpha) + a_k.$$
Replacing each term by its expansion (22), we find that
$$\begin{aligned}
f(x + \alpha) &= a_0(x^k + kx^{k-1}\alpha + \ldots) \\
&\quad + a_1[x^{k-1} + (k - 1)x^{k-2}\alpha + \ldots] + \ldots \\
&\quad + a_{k-1}(x + \alpha) + a_k \\
&= a_0x^k + a_1x^{k-1} + \ldots + a_k \\
&\quad + \alpha[ka_0x^{k-1} + (k - 1)a_1x^{k-2} + \ldots + a_{k-1}] + \ldots
\end{aligned}$$
Let us compare this with the equation (31):
$$f(x + \alpha) = f(x) + \alpha f'(x) + \ldots$$


We obtain the following result:
The derivative of the polynomial


$$f(x) = a_0x^k + a_1x^{k-1} + \ldots + a_{k-1}x + a_k \tag{32}$$
is the polynomial
$$f'(x) = ka_0x^{k-1} + (k - 1)a_1x^{k-2} + \ldots + a_{k-1}. \tag{33}$$
For example, the derivative of the polynomial


$$f(x) = 6x^7 + 8x^3 - 3x^2 - 1$$


is


$$f'(x) = 42x^6 + 24x^2 - 6x.$$
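
Rule (33) is easy to apply mechanically. In the following sketch, which is ours and not the author's, a polynomial is again assumed to be stored as a list of coefficients $a_0, a_1, \ldots, a_k$ with the highest power first; the name `poly_derivative` is hypothetical.

```python
def poly_derivative(coeffs):
    """Differentiate [a_0, a_1, ..., a_k] (highest power first) by rule (33)."""
    k = len(coeffs) - 1
    # the term a_i * x**(k - i) becomes (k - i) * a_i * x**(k - i - 1);
    # the constant term a_k drops out
    return [(k - i) * a for i, a in enumerate(coeffs[:-1])]

# 6x^7 + 8x^3 - 3x^2 - 1, written with explicit zero coefficients
f = [6, 0, 0, 0, 8, -3, 0, -1]
print(poly_derivative(f))
```

The result lists the coefficients of a polynomial of degree 6, to be compared with the derivative found above.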
NEWTON’S METHOD FOR THE
APPROXIMATE SOLUTION OF 
ALGEBRAIC EQUATIONS 


WE now consider the problem of approximating a root of an algebraic 
equation. Suppose we are given the equation


$$a_0x^k + a_1x^{k-1} + \ldots + a_k = 0. \tag{34}$$


We assume that by some method we have found an approximate
value $x_1$ of a root of this equation, and we are to show how to find
a better approximation. We denote by $\alpha_1$ the error in the value $x_1$
for the root, so that the root is $x_1 + \alpha_1$. We then have the equation


$$a_0(x_1 + \alpha_1)^k + a_1(x_1 + \alpha_1)^{k-1} + \ldots + a_k = 0. \tag{35}$$


In other words,
$$f(x_1 + \alpha_1) = 0,$$


where f(x) is the polynomial
$$a_0x^k + a_1x^{k-1} + \ldots + a_k.$$
But according to (31) we have


$$f(x_1 + \alpha_1) = f(x_1) + \alpha_1f'(x_1) + \ldots,$$


where the dots, as usual, replace terms containing $\alpha_1^2, \ldots, \alpha_1^k$.
Thus for the determination of $\alpha_1$ we have the equation
$$f(x_1 + \alpha_1) = f(x_1) + \alpha_1f'(x_1) + \ldots = 0. \tag{36}$$


If our initial approximation $x_1$ was good enough, then $\alpha_1$ is small,
and the sum of the missing terms in (36) will be small in comparison
with $\alpha_1$. If we neglect these terms, we obtain the approximation
$$f(x_1) + \alpha_1f'(x_1) \approx 0 \tag{37}$$
for $\alpha_1$. From this approximate equation we deduce that


$$\alpha_1 \approx -\frac{f(x_1)}{f'(x_1)}. \tag{38}$$
Thus we may obtain a better approximation $x_2$ to the root of our
equation by means of the formula


$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}. \tag{39}$$


We can now go a step further and obtain an even better approximation
by taking
$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}.$$


In general, suppose we have found an nth approximation $x_n$ to the
required root. Then we obtain a better approximation $x_{n+1}$ by means
of the formula


$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{40}$$


This formula may be expanded as


$$x_{n+1} = x_n - \frac{a_0x_n^k + a_1x_n^{k-1} + \ldots + a_{k-1}x_n + a_k}{ka_0x_n^{k-1} + (k - 1)a_1x_n^{k-2} + \ldots + a_{k-1}}. \tag{41}$$
If we are required to calculate a root to within a certain accuracy,
we need only carry out this process until $x_n$ and $x_{n+1}$ coincide to
the appropriate number of decimal places. We will then have
attained our solution.
This method of solving equations was developed by Newton. 


EXAMPLE. Use Newton’s method to find a root of the equation
$$x^3 - 3x - 5 = 0$$


to within 0.001, taking as a first approximation $x_1 = 3$.
Since the derivative of the polynomial


$$f(x) = x^3 - 3x - 5$$
is the polynomial
$$f'(x) = 3x^2 - 3,$$
formula (40) now takes the form


$$x_{n+1} = x_n - \frac{x_n^3 - 3x_n - 5}{3x_n^2 - 3}.$$
Therefore
$$x_2 = 3 - \frac{27 - 9 - 5}{3 \cdot 3^2 - 3} = 3 - \frac{13}{24} \approx 2.46,$$
$$x_3 = 2.46 - \frac{14.89 - 7.38 - 5}{3 \cdot 2.46^2 - 3} \approx 2.295,$$
$$x_4 = 2.295 - \frac{12.088 - 6.885 - 5}{15.801 - 3} = 2.295 - 0.016 = 2.279,$$
$$x_5 = 2.279 - \frac{11.837 - 6.837 - 5}{3 \cdot 2.279^2 - 3} = 2.279.$$


We see that $x_4 = x_5$ to within 0.001, and therefore a root of the
equation $x^3 - 3x - 5 = 0$ lies within 0.001 of 2.279.
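
The computation in this example can be reproduced by a short program. The sketch below is our own illustration of formula (40) for polynomials: the coefficient-list representation, the name `newton_poly`, and the stopping rule based on `tol` are assumptions made for the example.

```python
def newton_poly(coeffs, x1, tol=1e-3, max_steps=50):
    """Newton's method, formula (40), for a polynomial [a_0, ..., a_k] (highest power first)."""
    k = len(coeffs) - 1
    dcoeffs = [(k - i) * a for i, a in enumerate(coeffs[:-1])]  # rule (33)

    def evaluate(cs, x):
        value = 0.0
        for c in cs:              # Horner's scheme
            value = value * x + c
        return value

    x = x1
    for _ in range(max_steps):
        x_next = x - evaluate(coeffs, x) / evaluate(dcoeffs, x)
        if abs(x_next - x) < tol:  # x_n and x_{n+1} agree to the required accuracy
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_steps")

# x^3 - 3x - 5 = 0 with first approximation 3
root = newton_poly([1, 0, -3, -5], x1=3.0)
print(round(root, 3))
```

Because the program carries more decimal places than the hand calculation, its intermediate values differ slightly from those above, but the final approximations agree to within 0.001.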

The method we described in Chapter 6 for the approximate 
calculation of kth roots is a special case of Newton’s method. As 
we have already pointed out, a process of determining $\sqrt[k]{a}$ is merely
a process for solving the algebraic equation


$$x^k - a = 0.$$


Now the derivative of the polynomial $x^k - a$ is $kx^{k-1}$, and when
formula (40) is applied to this problem, we see that


$$x_{n+1} = x_n - \frac{x_n^k - a}{kx_n^{k-1}} = \frac{a + (k - 1)x_n^k}{kx_n^{k-1}}.$$


But this is precisely the formula (27) we used for calculating the
successive approximations to $\sqrt[k]{a}$.

We note the following substantial difference between the process 
of solving the equation $x^k - a = 0$ and the process of solving a
general algebraic equation


$$a_0x^k + a_1x^{k-1} + \ldots + a_k = 0.$$


For the equation $x^k - a = 0$ our choice of a first approximation
$x_1$ is immaterial (provided it is positive). Whatever value we choose
for it, we will ultimately come as close as we like to $\sqrt[k]{a}$. This is by
no means so when we try to find a root of equation (34). Here some
initial choices will lead to one root, some to another, and certain
initial choices will not lead to any root; that is, the sequence
$x_1, x_2, \ldots, x_n, \ldots$ will not tend to any definite value. In other
words, the sequence will diverge. It is true, however, that if the
sequence $x_1, x_2, \ldots, x_n, \ldots$, calculated according to (40), does
tend to a limit, then this limit will be a root of the equation f(x) = 0.
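
A small experiment makes this dependence on the first approximation concrete. The polynomial $x^3 - 2x + 2$ used below is our own illustration and does not come from the text; with the first approximation 0 the sequence produced by formula (40) alternates between 0 and 1 and tends to no limit, while with the first approximation -2 it settles down to the real root.

```python
def newton_steps(f, fprime, x1, steps):
    """Return the approximations x_1, x_2, ... produced by formula (40)."""
    xs = [x1]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

f = lambda x: x**3 - 2*x + 2        # illustrative polynomial (an assumption)
fprime = lambda x: 3*x**2 - 2       # its derivative by rule (33)

print(newton_steps(f, fprime, 0.0, 6))    # alternates between 0 and 1
print(newton_steps(f, fprime, -2.0, 6))   # approaches the real root
```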
THE GEOMETRIC
INTERPRETATION OF THE 
DERIVATIVE 


WE have given an account of Newton’s method only for polynomial 
functions. To generalize the method to a wider class of functions, 
we shall extend the concept of the derivative by clarifying the geo- 
metric significance of the derivative of a function. 


Let us consider the graph of the polynomial
$$y = a_0x^k + a_1x^{k-1} + \ldots + a_k$$


and choose two points M and N on it. Suppose the abscissa at M is
x and at N is $x + \alpha$. Then the ordinates at M and N are given by


$$f(x) = a_0x^k + a_1x^{k-1} + \ldots + a_k$$
and


$$f(x + \alpha) = a_0(x + \alpha)^k + a_1(x + \alpha)^{k-1} + \ldots + a_k.$$


Let us draw the secant through M and N, and calculate its slope k.
This is defined as the tangent of the angle MN makes with the x-axis.
If MN makes an angle of 60° with the x-axis, for example, its slope
is $\sqrt{3}$. We see from Fig. 2 that


$$k = \tan\psi = \frac{TN}{MT}.$$
But the length of MT is equal to the difference between the abscissas
of M and N, so that
$$MT = (x + \alpha) - x = \alpha.$$
Also TN is equal to the difference between their ordinates:
$$TN = f(x + \alpha) - f(x).$$
Thus
$$\tan\psi = \frac{TN}{MT} = \frac{f(x + \alpha) - f(x)}{\alpha}.$$


Fig. 2


But by formula (31)
$$f(x + \alpha) = f(x) + \alpha f'(x) + \ldots,$$
where the dots stand for terms in $\alpha^2, \ldots, \alpha^k$. Thus
$$\tan\psi = \frac{\alpha f'(x) + \ldots}{\alpha} = f'(x) + \ldots,$$


where this time the dots stand for terms in $\alpha, \alpha^2, \ldots, \alpha^{k-1}$.
Thus the slope of the secant MN is given by the formula


$$k_{\text{sec}} = \tan\psi = f'(x) + \ldots \tag{42}$$


We now make $\alpha$ smaller and smaller. As we do so the secant MN
will turn about M (which remains fixed, of course). In the limit as $\alpha$
tends to zero, the secant will swing into coincidence with the tangent
to the curve y = f(x) at the point M. In Fig. 3 we show the positions
of the secants for three successively smaller values of $\alpha$, and also
the tangent at M.

But as $\alpha$ tends to zero, the sum of all the terms in (42) which are
represented by dots also tends to zero, since they all have a factor of
$\alpha$. Thus the slope of the tangent to the graph of the curve y = f(x)
at the point with abscissa x is given by the formula


$$k_{\text{tan}} = f'(x). \tag{43}$$
Note that the graph of a polynomial does have a tangent at every
point.


Fig. 3


EXAMPLE. Find the angle between the x-axis and the tangent to the
graph of the equation


$$y = x^3 - 4x^2 + 5x + 1 = f(x)$$
at the point with abscissa x = 2.
Since
$$f'(x) = 3x^2 - 8x + 5,$$


we see that $f'(2) = 1$. Thus $\tan\varphi = 1$, so that $\varphi$, the limiting value
of $\psi$, is equal to 45°.
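
The way the secant slopes approach the slope of the tangent can be watched numerically. The following sketch is our own illustration for the polynomial of this example; the particular values of $\alpha$ are chosen by us.

```python
import math

def f(x):
    return x**3 - 4*x**2 + 5*x + 1

def secant_slope(f, x, alpha):
    """Slope of the secant through the points with abscissas x and x + alpha."""
    return (f(x + alpha) - f(x)) / alpha

for alpha in (1.0, 0.5, 0.25, 0.001):
    print(alpha, secant_slope(f, 2.0, alpha))

tangent_slope = 3 * 2.0**2 - 8 * 2.0 + 5        # f'(2) from rule (33)
print(math.degrees(math.atan(tangent_slope)))   # angle of the tangent with the x-axis
```

For this polynomial formula (42) reads $k_{\text{sec}} = 1 + 2\alpha + \alpha^2$ at x = 2, so the printed slopes decrease towards 1 as $\alpha$ decreases.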
THE GEOMETRIC
INTERPRETATION OF 
NEWTON’S METHOD 


WE are now in a position to make clear the geometric interpretation 
of Newton’s method for the approximate solution of algebraic 
equations. Suppose we are required to find a root of the equation 
f(x) = 0, where f(x) is some polynomial. Geometrically, this is the
problem of finding the points of intersection of the graph of the
function y = f(x) with the x-axis, that is, points at which y = 0.

Let us suppose that we already have an approximate value $x_1$ of a
root of this equation. Let N be the point on the graph of the curve
y = f(x) whose abscissa is $x_1$, and suppose the tangent to the curve
at N meets the x-axis at T. If our first choice $x_1$ was good, T will be
closer to our root than M, the point with abscissa $x_1$ from which we
started (see Fig. 4).


Fig. 4


To find the abscissa $x_2$ of the point T, let us consider the triangle
TMN. The length of the vertical side MN is precisely the value of
the function y = f(x) at the point $x_1$; that is, $MN = f(x_1)$. The length
of the horizontal side TM is $x_1 - x_2$. It follows that the tangent of
the angle $\varphi_1$ that TN makes with the x-axis is given by the expression


$$\tan\varphi_1 = \frac{f(x_1)}{x_1 - x_2}. \tag{44}$$
It follows from (44) that
$$x_2 = x_1 - \frac{f(x_1)}{\tan\varphi_1}. \tag{45}$$


But $\tan\varphi_1$ is the slope of the tangent to the curve y = f(x) at the
point with abscissa $x_1$. Thus, from the geometric interpretation of
the derivative we see that $\tan\varphi_1 = f'(x_1)$.


Fig. 5


So we may rewrite formula (45) as


$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.$$


We have thus found our second approximation for the required
root. We now repeat the whole process, drawing a tangent to the
curve at the point with abscissa $x_2$, and finding that it meets the
x-axis at the point with abscissa $x_3$, where


$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)}.$$

In general, if we have already found an approximation $x_n$, then to
find the next approximation $x_{n+1}$ we draw the tangent to the curve
at the point with abscissa $x_n$. The abscissa of the point of intersection
of this line with the x-axis is the required next approximation
$x_{n+1}$. It is given by


$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$


or, equivalently,
$$x_{n+1} = x_n - \frac{f(x_n)}{\tan\varphi_n}, \tag{46}$$


where $\varphi_n$ is the angle between the x-axis and the tangent to y = f(x)
at the point with abscissa $x_n$. This formula coincides with formula
(40) for Newton’s method. We have thus discovered the geometric
significance of Newton’s method. It amounts to our approximating
the arc of the curve y = f(x) between N and the required root by its
tangent at N. We may therefore call Newton’s method the method of
tangents.

Figure 5 shows how the successive approximations $x_1, x_2, \ldots, x_n, \ldots$,
obtained by Newton’s method, approach the point $\xi$ at
which the curve y = f(x) cuts the x-axis.

Note: In obtaining (45), and similarly (46), we assumed the graph
of y = f(x) was of the form shown in Fig. 4 (for example, we assumed
$f(x_1)$ was positive and $x_1 > x_2$). Even if the graph is of a different
form, however (see for example Fig. 7), the formula (45) for
the abscissa of the point of intersection of the tangent and the x-axis
will hold (and (46) likewise). But it should be noted that $x_{n+1}$ may
in some cases be further from $\xi$ than $x_n$ is (see Fig. 7a, with $x_1 = a$).
THE DERIVATIVE OF MORE
GENERAL FUNCTIONS 


Our geometric interpretation of Newton’s method of tangents had 
nothing to do with the assumption that f(x) was a polynomial. We 
can, in fact, extend the method immediately to the solution of any 
equation f(x) = 0, provided the graph of the function y = f(x) has a 
tangent at every point. To find a solution of the equation, we choose 
an approximate root $x_1$. At the point of the curve with abscissa $x_1$
we draw a tangent, and denote by $x_2$ the point at which it meets the
x-axis. If the tangent is horizontal, of course, we cannot do this, but
that merely means our first approximation was not good enough,
and we have to choose a closer one. Having obtained $x_2$, we again
draw the tangent to the curve y = f(x), at the point with abscissa $x_2$.
Continuing in this way, we obtain a sequence of approximations
$x_1, x_2, \ldots, x_n, \ldots$, where we may establish, just as when we assumed
f(x) was a polynomial, that


$$x_{n+1} = x_n - \frac{f(x_n)}{\tan\varphi_n}, \tag{47}$$


where $\tan\varphi_n$ is the slope of the tangent to the curve y = f(x) at the
point with abscissa $x_n$.

Formula (47) is not enough for calculations, since we do not yet
know how to compute $\tan\varphi_n$. We therefore need a method of
calculating the slopes of the tangents to the graphs of arbitrary
functions y = f(x), and not only of polynomials. First, let us find
the slope of a secant. Let M be a point on the graph of the curve
y = f(x) and MN a secant through it. Arguing as we did for
polynomials, we find that the slope of this secant is given by the formula


$$k_{\text{sec}} = \tan\psi = \frac{f(x + \alpha) - f(x)}{\alpha}, \tag{48}$$


where x is the abscissa of the point M and $x + \alpha$ of N. If we decrease
$\alpha$, this secant will turn about M and tend towards the position
of the tangent to the curve at M. We may therefore write


$$k_{\text{tan}} = \tan\varphi = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha}. \tag{49}$$
We call the limit on the right side the derivative of the function f(x) at
the point x, and denote it by $f'(x)$. The derived function (or simply
the derivative) of the function f is the function $f'$ whose value at the
point x is $f'(x)$, that is,


$$f'(x) = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha}. \tag{50}$$

Of course, this function is only defined where the limit exists, but it
can be proved that the limit does exist wherever the curve y = f(x)
has a (non-vertical) tangent.


We may now rewrite equation (49) in the form
$$k_{\text{tan}} = \tan\varphi = f'(x). \tag{51}$$


Thus for any function f, the derivative of f at a point is equal to the
slope of the tangent to the curve y = f(x) at that point (provided the
tangent exists).

Since $\tan\varphi_n = f'(x_n)$, formula (47) may be rewritten


$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{52}$$


This formula coincides with formula (40). We have thus extended
Newton’s method to arbitrary equations f(x) = 0.
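
Formula (52) can be tried on an equation that is not algebraic. The sketch below is ours, not the author's: when no formula for the derivative is supplied, it replaces the limit (50) by the secant slope (48) with a small fixed $\alpha$, which is only an approximation, and the equation cos x = x is chosen by us purely for illustration.

```python
import math

def difference_quotient(f, x, alpha=1e-6):
    """Secant slope (48) with a small alpha, used as a stand-in for the limit (50)."""
    return (f(x + alpha) - f(x)) / alpha

def newton(f, x1, fprime=None, tol=1e-3, max_steps=50):
    """Iterate formula (52); fprime defaults to the difference quotient above."""
    if fprime is None:
        fprime = lambda x: difference_quotient(f, x)
    x = x1
    for _ in range(max_steps):
        slope = fprime(x)
        if slope == 0:
            raise ZeroDivisionError("horizontal tangent: choose another first approximation")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_steps")

# Illustrative equation (an assumption, not from the text): cos x - x = 0.
root = newton(lambda x: math.cos(x) - x, x1=1.0)
print(root)
```

The check for a zero slope corresponds to the remark above that a horizontal tangent forces us to choose a different first approximation.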
THE CALCULATION OF
DERIVATIVES 


WE saw in the last section that the slope of the tangent to a curve 
y = f(x) at the point x is given by


$$f'(x) = \lim_{\alpha \to 0} \frac{f(x + \alpha) - f(x)}{\alpha}.$$


The calculation of this limit is, in general, rather difficult. The limit 
has been found in many important cases, however. In other words, 
the derived functions of the most familiar functions are known. 
We list the commonest:


1. $(c)' = 0$ (c a constant).
2. $(x^k)' = kx^{k-1}$.
3. $(a^x)' = a^x \ln a$.
4. $(\sin ax)' = a \cos ax$.
5. $(\cos ax)' = -a \sin ax$.
6. $(\tan ax)' = \dfrac{a}{\cos^2 ax}$.
7. $(\cot ax)' = -\dfrac{a}{\sin^2 ax}$.
8. $(\log_a x)' = \dfrac{1}{x \ln a}$.
9. $(\arcsin ax)' = \dfrac{a}{\sqrt{1 - a^2x^2}}$.
10. $(\arctan ax)' = \dfrac{a}{1 + a^2x^2}$.


In formulas 3 and 8 the logarithm is understood to be taken to base
e = 2.71828... (this is the so-called natural logarithm). In formula
2 the index k need not be a natural number, but can be any real
number. Thus


$$(\sqrt{x})' = (x^{1/2})' = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}.$$
’ y 


Formulas 1-10 do not suffice for the calculation of the derived
function of many of the well-known functions. But if a function f is
constructed by means of arithmetic operations from functions whose
derived functions are known, then we can calculate the derived
function of f. To do so, we use the following rules, which, like
formulas 1-10, are proved in courses of higher mathematics.

1. The derivative of the sum function of two given functions is the
sum function of their derivatives, that is,


$$(f_1 + f_2)' = f_1' + f_2'.$$
2. The derivative of the product of a constant and a function is the
product of the constant and the derivative of the function:


$$(af)' = af'.$$


3. The derivative of the product function of two functions is given
by the formula
$$(f_1f_2)' = f_1'f_2 + f_1f_2'.$$


4. The derivative of the quotient of two functions is given by the
formula
$$\left(\frac{f_1}{f_2}\right)' = \frac{f_1'f_2 - f_1f_2'}{f_2^2}.$$


The rule given in Chapter 7 for calculating the derivative of a 
polynomial is a consequence of rules 1 and 2 and formulas 1 and 2 
in our list. 


EXAMPLE 1. Find the derivative of the quotient


$$f(x) = \frac{3x^2 - x + 1}{2x^3 + 5}.$$

Using rule 4, we find that

$$f'(x) = \frac{(3x^2 - x + 1)'(2x^3 + 5) - (3x^2 - x + 1)(2x^3 + 5)'}{(2x^3 + 5)^2}.$$
Next, using the rule for differentiating a polynomial, we find that

$$(3x^2 - x + 1)' = 6x - 1$$

and
$$(2x^3 + 5)' = 6x^2,$$
and therefore

$$f'(x) = \frac{(2x^3 + 5)(6x - 1) - (3x^2 - x + 1)6x^2}{(2x^3 + 5)^2} = \frac{-6x^4 + 4x^3 - 6x^2 + 30x - 5}{(2x^3 + 5)^2}.$$
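
A result obtained from the rules can always be checked numerically against the definition (50). The sketch below is our own check of Example 1; it uses a symmetric difference quotient with a small step `h`, which is our choice of approximation to the limit and not a method described in the text.

```python
def f(x):
    return (3*x**2 - x + 1) / (2*x**3 + 5)

def f_prime(x):
    # closed form obtained from rule 4 (the quotient rule)
    return (-6*x**4 + 4*x**3 - 6*x**2 + 30*x - 5) / (2*x**3 + 5)**2

def central_difference(g, x, h=1e-5):
    """Symmetric difference quotient, a numerical stand-in for the limit (50)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (0.0, 1.0, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```

The two columns should agree to several decimal places; a sign error in one of the terms of the numerator would show up at once as a disagreement.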
EXAMPLE 2. Find the derivative of the function


1 1 
f%) = 70 (arc-sin 3x — =): 


Solution. Using formulas 2 and 9, and rules 1 and 2, we find that 
—2 1 
f= Toye ~ 0S) = evo + 
EXAMPLE 3. Find the derivative of the function
$$f(x) = 10^x \sin 2x.$$
Using rule 3 and formulas 3 and 4, we find that
$$\begin{aligned}
f'(x) &= (10^x)' \sin 2x + 10^x (\sin 2x)' \\
&= 10^x \sin 2x \ln 10 + 10^x \cdot 2\cos 2x \\
&= 10^x (\sin 2x \ln 10 + 2\cos 2x).
\end{aligned}$$


The rules we have given allow us to find the derivatives of a large 
number of functions. There is one further important rule—the rule 
for calculating the derivative of a composite function. 

5. If the function y = f(x) can be written in the form y = F(z),
where $z = \varphi(x)$, then its derivative is given by


$$f'(x) = F'(z)\varphi'(x), \tag{53}$$


where $z = \varphi(x)$.


EXAMPLE. Find the derivative of the function $y = \sin(x^3)$. This
function can be written in the form $y = \sin z$, where $z = x^3$. The
derivative of the function $F(z) = \sin z$ is $F'(z) = \cos z$, and the
derivative of the function $\varphi(x) = x^3$ is $\varphi'(x) = 3x^2$. Using formula
(53) we find that


$$[\sin(x^3)]' = F'(z)\varphi'(x) = \cos z \cdot 3x^2.$$
Substituting the value $x^3$ for z, we find that
$$[\sin(x^3)]' = 3x^2 \cos x^3.$$

A more detailed discussion of the concept of a derivative may be 
found in any good calculus textbook. 
FINDING A FIRST
APPROXIMATION 


WE now consider the selection of a first approximation to a root of 
our equation f(x) = 0. This may be done graphically by sketching 
the graph y = f(x) and seeing where it meets the x-axis. Since y = 0
at these points, they will be roots of the equation. The accuracy of
the first approximation will depend on the accuracy of our graph.
If for some reason it is inconvenient to graph the function, we
may use another method. With this method we calculate the
values of the function for certain values of x (for example the integers
in a certain range). If y = f(x) is continuous (if its graph has no
breaks), then between any two values a and b at which the function
has opposite signs there will be a root of the equation f(x) = 0 (see
Fig. 6a). If the graph has breaks, this may be false (Fig. 6b). We
may take a or b as our first approximation.


Fig. 6 (a) (b) (c)


Let us note that we may miss a number of roots if we use this
method. Thus in Fig. 6c we give an example of a function that has
the same sign at a and b, and yet has a root (actually two roots) in
between.

At any rate, suppose we now have the two points a and b at which
the function has opposite signs. Which is it better to take as our first
approximation? On considering Fig. 7a and b we conclude that if
the graph of the function is concave upward between a and b, the
better first approximation is the one at which f is positive. If we
choose the other point, our second approximation might even land
outside the interval [a, b]. On considering Fig. 7c and d, we see in
the same way that if the graph is concave downward, we should
choose the point at which f is negative.

Let us note here that we are talking about the best first approximation
to a root of f(x) = 0 so far as Newton’s method is concerned.
It can easily happen that when choosing between a and b, the point a
which is closest to the root $\xi$ will not yield as desirable a first
approximation as the point b which is further away from the root $\xi$.


Fig. 7

This rule for choosing between a and b is useful when we have a
graph of the function in front of us. If, however, we have no graph,
we need some other method to determine whether the graph is
concave upward or downward. To do so we must calculate the
second derivative of the function f(x). The second derivative of a
function f(x) is the derivative of its first derivative. If we are given the
function

$$f(x) = x^3 - 4x^2 + 3x - 1,$$
we find that its derivative is
$$f'(x) = 3x^2 - 8x + 3,$$
and therefore its second derivative is
$$f''(x) = 6x - 8.$$


In advanced mathematics it is proved that if the second derivative
of f is positive throughout the interval [a, b], then the graph of f is
concave upward in this interval, and if it is negative throughout the
interval, then f is concave downward. Using this fact, we obtain the
following rule for deciding on a first approximation when we are
using Newton’s method.

Suppose that f has opposite signs at the points a and b and that
the second derivative of f (which we assume exists) is positive
throughout this interval. Then for our first approximation we choose that
one of the points a, b at which f is positive. If, on the contrary, the
second derivative is negative throughout the interval, we should
choose the point at which f is negative.

Note, finally, that f might well be neither concave upward nor
concave downward throughout the interval [a, b]. It might begin,
for example, by being concave downward and then change over.
In such a case our rule does not help, but then it is advisable to
calculate the value of f at $c = \frac{1}{2}(a + b)$, and restrict our attention to
the subinterval ([a, c] or [c, b]) in which f changes sign.
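
The rule of this section can be written as a short program. The sketch below is only an illustration under our own assumptions: the name `choose_first_approximation` is hypothetical, and the sign of the second derivative is merely sampled at the two end points and the midpoint, which does not prove that it keeps one sign throughout the interval.

```python
import math

def choose_first_approximation(f, f2, a, b):
    """Pick a or b as the starting point for Newton's method by the sign rule above.

    f2 is the second derivative, assumed to keep one sign on [a, b];
    this sketch only samples it at the end points and the midpoint.
    """
    if f(a) * f(b) >= 0:
        raise ValueError("f must have opposite signs at a and b")
    samples = [f2(a), f2(0.5 * (a + b)), f2(b)]
    if all(s > 0 for s in samples):      # concave upward: take the end where f > 0
        return a if f(a) > 0 else b
    if all(s < 0 for s in samples):      # concave downward: take the end where f < 0
        return a if f(a) < 0 else b
    raise ValueError("second derivative changes sign: narrow the interval first")

f = lambda x: x - math.sin(x) - 0.5      # f(1) < 0 < f(2)
f2 = lambda x: math.sin(x)               # second derivative of f
print(choose_first_approximation(f, f2, 1.0, 2.0))
```

When the sampled signs disagree, the program refuses to choose, which corresponds to the advice to halve the interval and keep the half on which f changes sign.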
WE now describe a different method, the method of chords, for
solving equations approximately. Suppose, as in the previous
section, that we have already found points a and b at which f has
opposite signs. As we have said, if f is continuous, the equation
f(x) = 0 has a root between a and b. That is, there is at least one
point where the graph y = f(x) crosses the x-axis. To obtain an
approximation to this point, we replace the arc of the curve y = f(x)
lying between a and b by the chord MN (see Fig. 8), and find the
point T where this chord meets the x-axis.


Fig. 8

To find the point T we consider the similar triangles $MM_1T$ and
$NN_1T$. It follows from their similarity that $M_1T/MM_1 = TN_1/N_1N$.
But it is clear from Fig. 8 that $M_1T = a_1 - a$, $TN_1 = b - a_1$,
$MM_1 = -f(a)$, and $N_1N = f(b)$, where $a_1$ is the abscissa of T.
Thus we have

$$\frac{a_1 - a}{-f(a)} = \frac{b - a_1}{f(b)}.$$


On solving this equation for $a_1$, we find that


$$a_1 = \frac{af(b) - bf(a)}{f(b) - f(a)}.$$
This equation may be written


$$a_1 = b - f(b)\frac{b - a}{f(b) - f(a)} \tag{54}$$
or
$$a_1 = a - f(a)\frac{b - a}{f(b) - f(a)}. \tag{55}$$


The value a, is taken as our second approximation to the root (or 
one of the roots) of f(x) = 0 lying between a and b. 

Since f has opposite sign at a and b, the sign of $f(a_1)$ must differ
from at least one of them. The only other possibility is that $f(a_1) = 0$,
but in that case we have found the exact root and there is no
need to go further. Suppose then that f has opposite sign at a and
$a_1$. Then we apply formula (54) to the segment $[a, a_1]$ and find the
approximation


$$a_2 = a_1 - f(a_1)\frac{a_1 - a}{f(a_1) - f(a)}$$


for the required root. If f has opposite sign at $a_1$ and b, we apply the
formula (55) to the segment $[a_1, b]$ and find


$$a_2 = a_1 - f(a_1)\frac{b - a_1}{f(b) - f(a_1)}.$$


Having found the value of $a_2$, say between a and $a_1$, we see whether
the sign of $f(a_2)$ differs from that of f(a) or $f(a_1)$. If, say, the latter,
we continue as before, but taking $a_2$ and $a_1$ as the end points of our
interval (instead of a and b). In general, we shall find at the nth
stage that there is a root of the equation f(x) = 0 between c and $a_n$
(where c is either some previous $a_i$ or the point a) or between $a_n$
and d (where d is either a previous $a_i$ or the point b). We then apply
our original procedure, starting with the interval $[c, a_n]$ (or $[a_n, d]$),
to obtain a new point $a_{n+1}$ inside the interval. Thus our sequence of
approximations yields us a "nested" sequence of intervals, and this
will, in general, converge to a single point, the required root.
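
The procedure just described can be sketched as follows. This is our own illustration: the name `chords` is hypothetical, and the stopping rule (two successive chord points closer together than `tol`) is an assumption, since the text gives no stopping rule for the method of chords used on its own. The equation is the one treated later in this section.

```python
import math

def chords(f, a, b, tol=1e-3, max_steps=100):
    """Method of chords on [a, b], where f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f must have opposite signs at a and b")
    previous = None
    for _ in range(max_steps):
        x = a - fa * (b - a) / (fb - fa)   # formula (55) on the current interval
        fx = f(x)
        if fx == 0 or (previous is not None and abs(x - previous) < tol):
            return x
        # keep the subinterval on which f still changes sign
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        previous = x
    raise RuntimeError("no convergence within max_steps")

root = chords(lambda x: x - math.sin(x) - 0.5, 1.0, 2.0, tol=1e-6)
print(root)
```

Each pass replaces one end of the interval by the new chord point, so the intervals are nested exactly as described above.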

We now consider two useful special cases of this procedure. 
Suppose first that throughout the interval [a, b] the graph of f 
either decreases steadily and is concave upwards (as in Fig. 9a) or 
increases steadily and is concave downwards (Fig. 9b). Then we 
easily see that the left-hand end point of each interval of our nested 
sequence is the point a, and the right-hand end point $a_n$ is given
recursively by the formula


$$a_{n+1} = a_n - f(a_n)\frac{a_n - a}{f(a_n) - f(a)}. \tag{56}$$


Similarly, if the graph of f is of one of the types shown in Fig. 9c
and d, the right-hand end point of each of the intervals of our
nested sequence is the point b and the left-hand end point is given
recursively by the formula


$$a_{n+1} = a_n - f(a_n)\frac{b - a_n}{f(b) - f(a_n)}. \tag{57}$$


Fig. 9


The method of chords may profitably be used in combination 
with Newton’s method. Thus we start by calculating $a_1$ by means of
formula (55), and $x_1$ by means of the formula


$$x_1 = b - \frac{f(b)}{f'(b)} \tag{58}$$
or
$$x_1 = a - \frac{f(a)}{f'(a)},$$


according to the end of the segment [a, b] at which the sign of f
coincides with that of its second derivative. As we see from
Fig. 10a and b, the root $\xi$ of the equation f(x) = 0 will lie between
the points $a_1$ and $x_1$. At least, this will be true so long as the graph
is of one of the forms shown in Fig. 9. We start again, now, with
the points $a_1$ and $x_1$, and use them to obtain a further pair, $a_2$, $x_2$.
Working in this way, we find two sequences $a_1, a_2, \ldots, a_n, \ldots$
and $x_1, x_2, \ldots, x_n, \ldots$, converging to the required root from
opposite sides. This method has the advantage that we know at each
stage the limits of accuracy of our calculated value of the root.
We know at stage n that the root lies between $a_n$ and $x_n$, and therefore
that a value $\frac{1}{2}(a_n + x_n)$ cannot be more than $\frac{1}{2}|a_n - x_n|$ off.


Fig. 10


EXAMPLE. Use the combined method to calculate a root of the
equation
$$x - \sin x - 0.5 = 0$$
to within 0.001.
We compile a table of values of the continuous function


$$f(x) = x - \sin x - 0.5.$$


We see from this table that there is a root of the equation between
1 and 2. Using formulas 1, 2, and 4 of Chapter 12, we find that


$$f'(x) = 1 - \cos x.$$
Thus in our case Newton’s formula assumes the form


$$x_{n+1} = x_n - \frac{x_n - \sin x_n - 0.5}{1 - \cos x_n}. \tag{59}$$
To determine whether we should take 1 or 2 for $x_0$, we find the second
derivative of f. Using formulas 1 and 5 of Chapter 12, we find that
$f''(x) = \sin x$. But the function sin x is positive throughout this
interval, since 1 and 2 both lie within the interval $[0, \pi]$, and sin x
is positive on this larger interval. So, by the rule stated earlier, we
must take $x_0 = 2$, since f as well as $f''$ is positive there. By (59) we
find that


$$x_1 = 2 - \frac{2 - \sin 2 - 0.5}{1 - \cos 2} = 2 - \frac{2 - 0.909 - 0.5}{1 + 0.416} = 1.583.$$
On the other hand, we find from (55) that
$$a_1 = 1 - (-0.341)\frac{2 - 1}{0.591 - (-0.341)} = 1.366.$$
Applying formulas (59) and (55) to the segment $[a_1, x_1]$, we find
that


$$x_2 = 1.583 - \frac{1.583 - 1.000 - 0.5}{1 + 0.012} = 1.501$$
and
$$a_2 = 1.366 + 0.113\,\frac{1.583 - 1.366}{0.083 + 0.113} = 1.491.$$
Continuing, we find that
$$x_3 = 1.498,$$
$$a_3 = 1.498.$$


Thus a root of our equation is, to an accuracy of 0.001, equal to
1.498.
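
The combined method lends itself to a compact program. The sketch below is our own illustration, with a hypothetical function name and argument `newton_end`; it assumes, as the text does, that the graph is of one of the forms shown in Fig. 9, so that the root stays between the chord approximation and the Newton approximation at every stage.

```python
import math

def combined(f, fprime, a, b, newton_end, tol=1e-3, max_steps=50):
    """Chord step from one side and Newton step from the other.

    newton_end is the end point (a or b) where f has the sign of f'';
    the chord approximation starts from the opposite end.
    """
    x = newton_end
    c = a if newton_end == b else b
    for _ in range(max_steps):
        fc, fx = f(c), f(x)
        c_next = c - fc * (x - c) / (fx - fc)   # chord step, formula (55)
        x_next = x - fx / fprime(x)             # Newton step, formula (52)
        c, x = c_next, x_next
        if abs(x - c) < 2 * tol:                # the midpoint is then within tol of the root
            return 0.5 * (c + x)
    raise RuntimeError("no convergence within max_steps")

f = lambda x: x - math.sin(x) - 0.5
fprime = lambda x: 1 - math.cos(x)
root = combined(f, fprime, 1.0, 2.0, newton_end=2.0)
print(root)
```

Since the program does not round its intermediate values to three places, its answer may differ from the hand calculation in the last digit while still lying within 0.001 of the root.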
WE now consider a general method for the approximate solution of
equations. We shall see that both Newton’s method and the method 
of chords are special cases of this method, which is called the method 
of iteration, or the method of successive approximation. We start by 
examining this method as applied to a concrete example. 


EXAMPLE. Find a root of the equation

$$10x - 1 - \cos x = 0 \tag{60}$$

accurate to within 0.001.
We rewrite the equation (60) as

$$x = \frac{1 + \cos x}{10}. \tag{61}$$

We now choose some initial approximation, say $x_1 = 0$, and substitute
it for $x$ in the right side, but not the left side, of equation (61).
This gives an equation for $x$ whose value we take as our second
approximation to a root; that is, we take

$$x_2 = \frac{1 + \cos 0}{10} = 0.2.$$

Substituting $x_2$ in the same way in the right side of (61), we obtain
our third approximation

$$x_3 = \frac{1 + \cos 0.2}{10} \approx \frac{1 + 0.98}{10} = 0.198.$$

Continuing, we obtain

$$x_4 = \frac{1 + \cos 0.198}{10} \approx 0.198.$$

We see that $x_4 = x_3$ to within 0.001. Since $x_4 = (1 + \cos x_3)/10$,
$x_3 = 0.198$ is a root of equation (61) to within 0.001.
The method we have described for solving equation (61) is that
of iteration. In general we may describe the method as follows:

We write the equation $f(x) = 0$ in the form $x = \varphi(x)$. Then we
select an initial approximation $x_1$, substitute in the right side of the
equation, and obtain a second approximation $x_2 = \varphi(x_1)$. In
general, if we have an approximation $x_n$, we obtain the next
approximation $x_{n+1}$ by means of the formula

$$x_{n+1} = \varphi(x_n). \tag{62}$$

If we find that to within the required accuracy $x_{n+1} = x_n$, we may
take $x_n$ for our root.
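
The general recipe translates almost word for word into a short routine. This is only a sketch under stated assumptions: the name `iterate`, the tolerance argument, and the step limit are illustrative choices, not part of the text. It is applied here to the rewritten equation $x = (1 + \cos x)/10$ with the starting value 0.

```python
import math

def iterate(phi, x1, tol=0.001, max_steps=100):
    """Repeat x_{n+1} = phi(x_n) until two successive values agree to tol."""
    x = x1
    for _ in range(max_steps):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no agreement within max_steps; the sequence may diverge")

root = iterate(lambda x: (1 + math.cos(x)) / 10, 0.0)
print(round(root, 3))  # 0.198
```

The step limit is a safeguard only: nothing in the routine itself guarantees that the sequence settles down, so a run that never meets the tolerance is reported rather than looped on forever.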
Newton’s method is a special case of the method of iteration,
because the method may be presented as follows.
Suppose we are given the equation $f(x) = 0$. We divide both sides
by $-f'(x)$ and add $x$. We obtain

$$x = x - \frac{f(x)}{f'(x)}.$$

This equation is clearly equivalent to the original one, except
possibly at those points where $f'(x) = 0$, where it is not defined.
It turns out, however, that in a certain sense, which we shall not
investigate here, this “almost never” matters.

Applying the method of iteration to this equation, we find that the
successive approximations $x_{n+1}$ are given by

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$

But this is precisely the formula used in Newton’s method.

In the same way we may show that the method of chords is a
special case of the method of iteration. We may rewrite the equation
$f(x) = 0$ in the form

$$x = x - f(x)\,\frac{x - a}{f(x) - f(a)}$$

or in the form

$$x = x - f(x)\,\frac{b - x}{f(b) - f(x)}.$$

A number of questions arise in connection with the method of
iteration we have described:

1. Does the sequence $x_1, \ldots, x_n, \ldots$ obtained by the method of
iteration always converge to some number $\xi$?
2. If it does, then is $\xi$ a root of the equation $x = \varphi(x)$?
3. How rapidly does the sequence $x_1, \ldots, x_n, \ldots$ tend to a
root $\xi$ of the equation $x = \varphi(x)$?


The easiest question to answer is the second. Suppose the sequence
$x_1, \ldots, x_n, \ldots$ tends to the limit $\xi$. Consider the equation

$$x_{n+1} = \varphi(x_n)$$

giving each term of the sequence by means of its predecessor.
As $n$ increases, the left side tends to $\xi$, and if $\varphi$ is continuous
the right side tends to $\varphi(\xi)$. Thus in the limit we have $\xi = \varphi(\xi)$,
which is to say that $\xi$ is a root of the equation $x = \varphi(x)$.

The answer to the first question is negative. As an example we may
take the problem of Achilles and the antelope (Chapter 3), where
the method of iteration applied to equation (8) (see equation (9)),
leads to the divergent sequence 0, 100, 300, 500, .... There can be no
limit, since the successive terms increase without bound. Notice,
however, that if we rewrite equation (8) in the form $x = \tfrac{1}{2}(x - 100)$
and take $x_1 = 0$, then the method does lead to the correct solution
$x = -100$. The first few terms are $0, -50, -75, -87.5, -93.75, \ldots$.

Another example is given by the equation

$$x = 10^x - 2.$$

If we let $x_1 = 1$, then

$$x_2 = 8, \quad x_3 = 10^8 - 2, \quad \ldots$$

As $n$ increases, $x_n$ increases without bound. But if we rewrite this
equation as

$$x = \log (x + 2),$$

then the approximating sequence converges, and after three
approximations we find that $x = 0.38$ to two decimal places.
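
The contrast between the two ways of writing this equation is easy to reproduce. The sketch below is illustrative only: the helper name `first_terms` is hypothetical, and `log` is taken to mean the common (base 10) logarithm, which is what makes the two forms equivalent.

```python
import math

def first_terms(phi, x1, count):
    """Return the terms x_1, ..., x_count of the sequence x_{n+1} = phi(x_n)."""
    terms = [x1]
    for _ in range(count - 1):
        terms.append(phi(terms[-1]))
    return terms

# Form x = 10**x - 2: the terms grow without bound.
growing = first_terms(lambda x: 10**x - 2, 1.0, 3)

# Form x = log10(x + 2): the terms settle down.
settling = first_terms(lambda x: math.log10(x + 2), 1.0, 6)

print(growing)
print([round(t, 3) for t in settling])
```

Only three terms of the first sequence are computed on purpose: one more step would already ask for ten raised to roughly the hundred-millionth power.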

We should therefore rephrase our first question: “What sort of
function $\varphi$ should we choose in order for the sequence
$x_1, \ldots, x_n, \ldots$ to converge?”

Our answer to question 2 shows that it is natural for us to restrict
our attention to cases where $\varphi$ is continuous, and this we shall do.
Before attempting to go further into the question, we examine the
geometric interpretation of the method of iteration.


THE GEOMETRIC
INTERPRETATION OF THE
METHOD OF ITERATION


FINDING a root $\xi$ of the equation $x = \varphi(x)$ is clearly the same as
finding the abscissa of a point $M$ at which the curve $y = \varphi(x)$ and the
straight line $y = x$ meet. Suppose we start with some initial
approximation $x_1$ to $\xi$ (Fig. 11a and b). Then the point $M_1$ with
coordinates $(x_1, \varphi(x_1))$ lies on the curve $y = \varphi(x)$. We now draw the
horizontal line through it, to meet the straight line $y = x$ at the point


Fig. 11 


$N_1(\varphi(x_1), \varphi(x_1))$. We denote $\varphi(x_1)$ by $x_2$. Then the coordinates of $N_1$
are $(x_2, x_2)$. We now draw the vertical line through $N_1$. It meets the
curve $y = \varphi(x)$ at the point $M_2(x_2, \varphi(x_2))$. Continuing the process,
we obtain the point $N_2$ on the line $y = x$ with coordinates $(x_3, x_3)$,
where $x_3 = \varphi(x_2)$, and then the point $M_3$ with coordinates $(x_3, \varphi(x_3))$,
and so on. If the approximating process converges then the sequence
of points $M_1, M_2, \ldots$ will converge to the point $M$ of intersection
of the line and the graph of $y = \varphi(x)$.

Thus the geometric interpretation of the method of iteration is
that we move toward some point of intersection of the curve and the
line along a broken path whose vertices lie on the curve and the line
alternately, and whose sides are alternately horizontal and vertical
(Fig. 11a).
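
The broken path can be listed vertex by vertex. The sketch below is an illustration only: the name `broken_path` is hypothetical, and the function used is the one from the earlier example, $x = (1 + \cos x)/10$. Vertices on the curve and on the line $y = x$ alternate in the returned list.

```python
import math

def broken_path(phi, x1, steps):
    """Vertices M_1, N_1, M_2, N_2, ... of the path described in the text."""
    points = []
    x = x1
    for _ in range(steps):
        y = phi(x)
        points.append((x, y))  # M_k, on the curve y = phi(x)
        points.append((y, y))  # N_k, on the line y = x, same height as M_k
        x = y                  # the next M lies directly above or below N_k
    return points

path = broken_path(lambda x: (1 + math.cos(x)) / 10, 0.0, 3)
for point in path:
    print(point)
```

Each pair $M_k$, $N_k$ shares its ordinate (a horizontal side), and each pair $N_k$, $M_{k+1}$ shares its abscissa (a vertical side), which is exactly the staircase or spiral pattern.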

If the curve and the line bear the sort of relationship to each other 
that is shown in Fig. 11a, then this broken line looks like a staircase 
(at least, as soon as we have come sufficiently close to M). On the 
other hand, if the curve and the line are disposed as in Fig. 11b, the 
broken line looks like a spiral. 

If the process of iteration diverges (as in the problem of Achilles
and the antelope), the steps of our staircase or spiral will become ever
bigger, and the points $M_1, M_2, \ldots$ will recede from $M$ instead of
converging towards it (Fig. 12a and b).


Fig. 12 


We see from Fig. 11a and b that if the angle of inclination of the
tangent to our curve at the point $M$ lies between $-45^\circ$ and $+45^\circ$,
and if our first approximation is reasonably close to $M$, then the
sequence of points $M_1, M_2, \ldots$ will converge to $M$. If we choose
$M_1$ too far to the right, we may “come under the influence” of the
next point of intersection after $M$, or even more complicated things
may happen. At any rate, in the case we are considering the slope of
our curve at the point $M(\xi, \xi)$ lies between $-1$ and $1$. That is,
$|\varphi'(\xi)| < 1$. If the angle of inclination of the tangent at $M$ is greater
than $45^\circ$ or less than $-45^\circ$ (Fig. 12a and b), the points $M_1, M_2,
M_3, \ldots$ will get further and further from $M$. (They may not tend
to the next point of intersection, however.) In this case either
$\varphi'(\xi) > 1$ or $\varphi'(\xi) < -1$. These two inequalities may be written
compactly in the form

$$|\varphi'(\xi)| > 1.$$


Thus if $\varphi$ has a derivative, a sufficient condition for the method
of iteration to work is that at the root $\xi$ of the equation we have
$|\varphi'(\xi)| < 1$. If the condition is satisfied, the sequence of
approximations will converge provided only the first approximation is
sufficiently close. If the condition is not satisfied, no choice of a first
approximation, unless it is exactly the root, will lead to a sequence of
approximations converging to $\xi$, though they may converge to some other
root of the equation. We may say intuitively that the points of
intersection where $|\varphi'(\xi)| < 1$ pull the broken line of the sequence
of approximations towards them, and the points where $|\varphi'(\xi)| > 1$
drive it away.†

Since we do not know beforehand the exact value of $\xi$, this rule
as it stands has no practical use. It may be proved, however, that if
$|\varphi'(x)| < 1$ throughout the interval $[a, b]$, then there is at most one
root of the equation $x = \varphi(x)$ on this interval, and if there is one,
then any choice of a first approximation which lies in $[a, b]$ will lead
to the root upon applying the method of iteration.


EXAMPLE 1. Can the process of iteration be applied to find a root
of the equation

$$x = \frac{\cos x + \sin x}{4}\,?$$

Here we have

$$\varphi(x) = \frac{\cos x + \sin x}{4}.$$

Thus

$$\varphi'(x) = \frac{-\sin x + \cos x}{4}.$$


† Note that if the tangent to $y = \varphi(x)$ at $x = \xi$ makes an angle of $45^\circ$
or $-45^\circ$ with the $x$-axis, that is, $|\varphi'(\xi)| = 1$, then the sequence obtained
may either converge or diverge. To illustrate this we give two examples
for which $|\varphi'(\xi)| = 1$, but with one sequence converging and the other
diverging.

If we try to solve the equation $x = \sin x$ by successive approximation
we will succeed with any initial approximation. The sequence of
approximations will converge to 0. But if we try to solve the equation $x^3 = 0$ by
writing it in the form $x = x^3 + x$, the sequence obtained diverges for any
first approximation except $x_1 = 0$. In both cases $|\varphi'(\xi)| = 1$.
But $|\sin x| \le 1$, $|\cos x| \le 1$, so that

$$|\varphi'(x)| = \left|\frac{-\sin x + \cos x}{4}\right| \le \frac{|\sin x| + |\cos x|}{4} \le \frac{1}{2} \quad \text{for all } x,$$

and, by what we stated above, the equation has at most one root,
and any choice of a first approximation will yield a sequence tending
to it. To see that there does exist a root, notice that for large negative
values of $x$ we have $\varphi(x) > x$, while for large positive values of $x$
we have $x > \varphi(x)$. Thus the continuous curve $y = \varphi(x)$ starts off
above the line $y = x$, and ends up under it, and since $\varphi$ is continuous,
it must cross at some point.
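
The claim that any first approximation leads to the same root can be tried numerically. The following sketch is an illustration, not a proof: the helper name `limit_from`, the fixed number of 60 steps, and the particular starting values are arbitrary choices made here.

```python
import math

def phi(x):
    return (math.cos(x) + math.sin(x)) / 4

def limit_from(x1, steps=60):
    """Apply x_{n+1} = phi(x_n) a fixed number of times, starting from x1."""
    x = x1
    for _ in range(steps):
        x = phi(x)
    return x

limits = [limit_from(x1) for x1 in (-100.0, -1.0, 0.0, 3.0, 1000.0)]
print(limits)
```

Since each step shrinks the distance to the root by at least one half, sixty steps are far more than enough for widely separated starting values to agree to the last printed digit.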


EXAMPLE 2. Can the method of iteration be applied to solve the
equation

$$x = 4 - 2^x\,? \tag{63}$$

We note first that this equation has exactly one root. Consider the
graph of the function $y = x + 2^x - 4$. It is clear that the graph
increases steadily, and is negative for large negative values of $x$ and
positive for large positive values. Thus the graph must cross the
$x$-axis (at which point there is a root of the equation), and it cannot
cross it more than once, since the function is steadily increasing.
Now our root must lie between 1 and 2, for

$$1 < 4 - 2^1 = 2 \quad\text{while}\quad 2 > 4 - 2^2 = 0.$$


We therefore confine our attention to the segment $[1, 2]$. On this
segment we have

$$\varphi'(x) = -2^x \ln 2,$$

where $1 \le x \le 2$, so that $2 \le 2^x \le 4$, and

$$2\ln 2 \le 2^x \ln 2 \le 4\ln 2.$$

Using tables of natural logarithms (to base $e = 2.718\ldots$), we find
that $\ln 2 = 0.69\ldots$. So on the interval $[1, 2]$ we have

$$1.38 < |\varphi'(x)| < 2.78,$$

and the process of iteration cannot be applied.
To be able to apply it, we rewrite equation (63) in another form.
We begin by writing

$$2^x = 4 - x$$



and then take logarithms to base 2 of both sides. We find

$$x = \log_2 (4 - x).$$

For our new function $\varphi$ we find that

$$\varphi'(x) = -\frac{1}{(4 - x)\ln 2},$$

so that on the segment $[1, 2]$ we have the inequality

$$|\varphi'(x)| \le \frac{1}{2\ln 2} \approx \frac{1}{1.38} < 1.$$


The reader may easily prove this inequality. 
Thus when the equation is written in this form, the sequence of 
approximations will converge. 
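
The two forms of equation (63) can be compared side by side. This is a small sketch under stated assumptions: the helper name `run`, the starting value 1.5, and the step counts are choices made for illustration and do not come from the text.

```python
import math

def run(phi, x1, steps):
    """Return x_1 and the next `steps` terms of x_{n+1} = phi(x_n)."""
    xs = [x1]
    for _ in range(steps):
        xs.append(phi(xs[-1]))
    return xs

# Original form x = 4 - 2**x: the derivative exceeds 1 in size on [1, 2].
bad = run(lambda x: 4 - 2**x, 1.5, 4)

# Rewritten form x = log2(4 - x): the derivative is below 1 in size on [1, 2].
good = run(lambda x: math.log2(4 - x), 1.5, 30)

print([round(t, 3) for t in bad])
print(round(good[-1], 6))
```

The first list swings further from the root with each step, while the second settles on a value that satisfies $x + 2^x = 4$ to many decimal places.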
THE SPEED OF CONVERGENCE
OF A SEQUENCE OF
SUCCESSIVE APPROXIMATIONS†


WE now give an estimate for the speed with which a sequence of
successive approximations converges, that is, the speed with which
the successive errors $\alpha_n = \xi - x_n$ tend to zero. We shall need a
formula known as the mean value formula.

Consider the function $y = f(x)$ defined on a segment $[a, b]$, and
suppose it has a derivative throughout this interval. Suppose $M$ is
the first and $N$ the last point of the graph of this function. Then the
slope of the chord $MN$ is given by

$$k_{\text{chord}} = \tan\varphi = \frac{PN}{MP}$$

(Fig. 13). But $MP = b - a$, and $PN = f(b) - f(a)$, and therefore

$$k_{\text{chord}} = \frac{f(b) - f(a)}{b - a}.$$

Fig. 13

† This section may be omitted on first reading.

Suppose $T$ is a point of the arc $MN$ lying at the maximum possible
distance from the chord $MN$. If we draw a parallel through $T$ to $MN$,
it will be the tangent at $T$, for otherwise it would cross the arc, so that
there would be a point of the arc further than $T$ from $MN$. In other
words, the tangent at $T$ is parallel to $MN$ and therefore has the same
slope. But the slope of the tangent at $T$ is $f'(c)$, where $c$ is the abscissa
of $T$. We thus have the formula


$$f'(c) = \frac{f(b) - f(a)}{b - a}. \tag{64}$$

This is the mean value formula. We note that the point $c$ always
lies strictly between $a$ and $b$. The formula can also be written

$$f(b) - f(a) = f'(c)(b - a). \tag{65}$$


We return now to solving the equation $x = \varphi(x)$ by the method of
iteration, and we suppose that $\varphi$ has a derivative everywhere. Let $\xi$
be the required root of the equation, and $x_1, \ldots, x_n, \ldots$ the
sequence obtained by applying the method of iteration to an initial
approximation $x_1$. We then have the equations $\xi = \varphi(\xi)$ and
$x_{n+1} = \varphi(x_n)$. It follows from them that

$$\alpha_{n+1} = \xi - x_{n+1} = \varphi(\xi) - \varphi(x_n).$$

But by the mean value formula

$$\varphi(\xi) - \varphi(x_n) = \varphi'(c_n)(\xi - x_n) = \varphi'(c_n)\,\alpha_n,$$

where, for each $n$, $c_n$ is a point lying between $x_n$ and $\xi$. Thus

$$\alpha_{n+1} = \varphi'(c_n)\,\alpha_n. \tag{66}$$


From equation (66) we make the following deduction:

Let $\xi$ be a root of the equation $x = \varphi(x)$ in the interval $[a, b]$. If
throughout this interval we have the inequality $|\varphi'(x)| < q < 1$, and
our first approximation $x_1$ is chosen in the interval, then for every $n$
we have the inequality

$$|\alpha_{n+1}| < q^n |\alpha_1|. \tag{67}$$

It follows from (66) that

$$|\alpha_2| = |\varphi'(c_1)|\,|\alpha_1|.$$

But $c_1$ lies inside the interval (see Fig. 14), and therefore

$$|\varphi'(c_1)| < q.$$

It follows from this that

$$|\alpha_2| < q|\alpha_1|.$$

Similarly, we find that

$$|\alpha_3| = |\varphi'(c_2)|\,|\alpha_2| < q|\alpha_2| < q^2|\alpha_1|,$$

and, in general,

$$|\alpha_{n+1}| < q^n |\alpha_1|.$$

We have thus proved our assertion.

Fig. 14

Since whenever $0 < q < 1$ the sequence $q, q^2, \ldots, q^n, \ldots$ tends
to zero, the successive errors $\alpha_{n+1}$ will also tend to zero as $n$ increases.
In other words, under the conditions given above, the sequence of
approximations $x_1, \ldots, x_n, \ldots$ will always converge to $\xi$, and the
successive errors $|\xi - x_{n+1}|$ will decrease faster than the terms of the
geometric sequence $\{|\alpha_1| q^n : n = 0, 1, 2, \ldots\}$.
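
The estimate (67) can be checked on the earlier equation $x = (1 + \cos x)/10$. The sketch below rests on assumptions made for the purpose of illustration: $q = 0.1$ is used because the derivative there is $-\sin x / 10$, and the exact root is replaced by the result of a long run of the iteration, since no closed form is available.

```python
import math

def phi(x):
    return (1 + math.cos(x)) / 10

q = 0.1                   # |phi'(x)| = |sin x| / 10 never exceeds 0.1

xi = 0.0
for _ in range(60):       # a long run stands in for the exact root
    xi = phi(xi)

x = 0.0                   # first approximation x_1
alpha_1 = abs(xi - x)
bounds_hold = []
for n in range(1, 6):
    x = phi(x)            # x is now x_{n+1}
    bounds_hold.append(abs(xi - x) <= q**n * alpha_1)

print(bounds_hold)
```

In this example the actual errors fall well below the bound, because near the root the derivative is much smaller than the crude value of $q$ used here.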

We can show in the same way that if throughout the interval
$[a, b]$ we have the inequality

$$|\varphi'(x)| > 1,$$

then the sequence obtained by a process of iteration will diverge.
A geometric discussion was given in Chapter 16.

The process of iteration converges especially rapidly if the derivative
of $\varphi$ is zero at $\xi$. In that case (assuming that the derivative of $\varphi$
is continuous, as it almost always will be), $\varphi'(x)$ tends to zero as $x$
tends to $\xi$. But since

$$|\alpha_{n+1}| = |\varphi'(c_n)|\,|\alpha_n|$$

and $c_n$ tends to $\xi$, the rate of convergence will increase as we approach
$\xi$.

We have met this phenomenon in our consideration of an iterative
method of extracting square roots (Chapter 5). We recall that we there
replaced the equation $x^2 = a$ by the equation $x = \dfrac{x^2 + a}{2x}$. But the
derivative of the function

$$\varphi(x) = \frac{x^2 + a}{2x} = \frac{1}{2}x + \frac{a}{2}x^{-1}$$

is the function

$$\varphi'(x) = \frac{1}{2} + \frac{a}{2}\cdot(-1)x^{-2}$$

(see rules 1 and 2 and formula 2 in Chapter 12). Thus

$$\varphi'(\sqrt{a}) = \frac{1}{2} + \frac{a}{2}\cdot(-1)\cdot\frac{1}{a} = 0.$$

So at the root $x = \sqrt{a}$ of our equation the derivative of the function $\varphi$
is zero, and this is why the process of approximation continues to accelerate
as we approach the solution.

This phenomenon of speeding up as we continue the process of
approximation is typical of Newton’s method, of which our process for extracting
square roots was a special case. We have seen that Newton’s method
amounts to applying the method of iteration to the equation $f(x) = 0$ as
rewritten in the form

$$x = x - \frac{f(x)}{f'(x)}.$$

Here we have

$$\varphi(x) = x - \frac{f(x)}{f'(x)}.$$

But

$$\varphi'(x) = 1 - \left[\frac{f(x)}{f'(x)}\right]' = 1 - \frac{f'(x)f'(x) - f(x)f''(x)}{[f'(x)]^2} = 1 - \frac{[f'(x)]^2 - f(x)f''(x)}{[f'(x)]^2} = \frac{f(x)f''(x)}{[f'(x)]^2}.$$

Since $f(\xi) = 0$, we also have $\varphi'(\xi) = 0$. And, as we have said, this is
sufficient to make the approximating process speed up as we approach $\xi$.
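
The speeding up is visible in the square-root iteration itself. The sketch below is illustrative: the name `sqrt_errors`, the choice $a = 2$, and the starting value 1 are assumptions made here, and the library square root serves only as a reference value for measuring the errors.

```python
import math

def sqrt_errors(a, x1, steps):
    """Errors |sqrt(a) - x_n| for the iteration x_{n+1} = (x_n**2 + a) / (2 * x_n)."""
    x, errors = x1, []
    for _ in range(steps):
        errors.append(abs(math.sqrt(a) - x))
        x = (x * x + a) / (2 * x)
    return errors

errors = sqrt_errors(2.0, 1.0, 5)
# Ratio of each error to the one before it.
ratios = [later / earlier for earlier, later in zip(errors, errors[1:])]

print(errors)
print(ratios)
```

For an iteration with a fixed contraction factor the ratios would stay roughly constant. Here they shrink from step to step, which is the acceleration described in the text.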
THE SOLUTION OF A SYSTEM
OF LINEAR EQUATIONS BY
SUCCESSIVE APPROXIMATION


So far we have considered the solution of equations having only one
unknown. We now consider the solution of systems of equations,
confining our attention to systems of the first degree. Suppose we
are given $m$ equations of the first degree, in the $m$ unknowns
$x_1, x_2, \ldots, x_m$:

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1m}x_m &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2m}x_m &= b_2,\\
&\ \ \vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mm}x_m &= b_m.
\end{aligned} \tag{68}$$


Here the $a_{ij}$ are constants ($i$ refers to the number of the equation,
and $j$ to the number of the unknown). For example, $a_{21}$ is the
coefficient of $x_1$ in the second equation. Such systems of linear equations
are met with in many applications: in constructing accurate maps
for large portions of the earth’s surface (geodesy) or in engineering
for estimating the strains and forces on systems of interlinked rods
(for example, in the construction of a bridge or an airplane wing).

Solution of such systems by ordinary methods, such as the
successive elimination of unknowns, is tedious. It is often easier to
use a process of successive approximation. We shall start by giving
an example of how such a process can be applied.

Suppose we are given the system of equations

$$\begin{aligned}
10x_1 - 2x_2 + x_3 &= 9,\\
x_1 + 5x_2 - x_3 &= 8,\\
4x_1 + 2x_2 + 8x_3 &= 32.
\end{aligned}$$

We are to find the values of $x_1, x_2, x_3$ to within 0.01.
We solve the first equation for $x_1$, the second for $x_2$,
and the third for $x_3$, to obtain the rewritten system

$$\begin{aligned}
x_1 &= 0.9 + 0.2x_2 - 0.1x_3,\\
x_2 &= 1.6 - 0.2x_1 + 0.2x_3,\\
x_3 &= 4 - 0.5x_1 - 0.25x_2.
\end{aligned} \tag{69}$$

We choose an arbitrary set of values as our initial approximations, for
example $x_1^{(0)} = 0$, $x_2^{(0)} = 0$, $x_3^{(0)} = 0$. We substitute these values in
the right side of the system (69) and take the resulting values as our
next approximations to the three unknowns. Thus we find that

$$\begin{aligned}
x_1^{(1)} &= 0.9,\\
x_2^{(1)} &= 1.6,\\
x_3^{(1)} &= 4.
\end{aligned}$$


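These new values are now inserted in the right side of (69). We
obtain the approximations

$$\begin{aligned}
x_1^{(2)} &= 0.9 + 0.2\cdot 1.6 - 0.1\cdot 4 = 0.82,\\
x_2^{(2)} &= 1.6 - 0.2\cdot 0.9 + 0.2\cdot 4 = 2.22,\\
x_3^{(2)} &= 4 - 0.5\cdot 0.9 - 0.25\cdot 1.6 = 3.15.
\end{aligned}$$

In general, if we have found a set of $n$th approximations $x_1^{(n)}, x_2^{(n)},
x_3^{(n)}$, then as our set of $(n + 1)$st approximations we take the values
given by the formulas

$$\begin{aligned}
x_1^{(n+1)} &= 0.9 + 0.2x_2^{(n)} - 0.1x_3^{(n)},\\
x_2^{(n+1)} &= 1.6 - 0.2x_1^{(n)} + 0.2x_3^{(n)},\\
x_3^{(n+1)} &= 4 - 0.5x_1^{(n)} - 0.25x_2^{(n)}.
\end{aligned} \tag{70}$$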

The results of our successive calculations are shown in Table 2. 


TABLE 2 


SUCCESSIVE APPROXIMATION 


We see that to within the required accuracy we have

$$x_1^{(5)} = x_1^{(6)}, \quad x_2^{(5)} = x_2^{(6)}, \quad x_3^{(5)} = x_3^{(6)}. \tag{71}$$

Setting $n = 5$ in (70) and taking account of (71), we see that to the
required accuracy we have

$$\begin{aligned}
x_1^{(5)} &\approx 0.9 + 0.2x_2^{(5)} - 0.1x_3^{(5)},\\
x_2^{(5)} &\approx 1.6 - 0.2x_1^{(5)} + 0.2x_3^{(5)},\\
x_3^{(5)} &\approx 4 - 0.5x_1^{(5)} - 0.25x_2^{(5)}.
\end{aligned}$$

Actually these equations are exact (if we take 2 and not 1.99 for $x_2$),
but this is not the point. It follows that, within the required degree of
accuracy, the numbers $x_1^{(5)} = 1.00$, $x_2^{(5)} = 2.00$, $x_3^{(5)} = 3.00$ form a
solution to our system.
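
The successive approximations for this system are easy to regenerate. The sketch below is a direct transcription of formulas (70) with the zero starting values; the function name `next_approximation` and the number of steps are illustrative choices, and the printed rows are computed here rather than copied from the book's table.

```python
def next_approximation(x):
    """One pass of formulas (70): every right side uses the previous values only."""
    x1, x2, x3 = x
    return (
        0.9 + 0.2 * x2 - 0.1 * x3,
        1.6 - 0.2 * x1 + 0.2 * x3,
        4.0 - 0.5 * x1 - 0.25 * x2,
    )

approximations = [(0.0, 0.0, 0.0)]
for _ in range(6):
    approximations.append(next_approximation(approximations[-1]))

for n, values in enumerate(approximations):
    print(n, [round(v, 2) for v in values])
```

Rows 5 and 6 differ by less than 0.01 in every unknown, which is the stopping signal used in the text.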

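We proceed in the same way in the general case. Suppose we are
given the system (68). We solve the first equation for $x_1$, the
second for $x_2$, and so on. If, as we are assuming, none of the
coefficients $a_{kk}$ is zero, the system (68) assumes the form

$$\begin{aligned}
x_1 &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2 - \cdots - \frac{a_{1m}}{a_{11}}x_m,\\
x_2 &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1 - \cdots - \frac{a_{2m}}{a_{22}}x_m,\\
&\ \ \vdots\\
x_m &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1 - \cdots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}.
\end{aligned} \tag{72}$$
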

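Suppose $x_1^{(1)}, \ldots, x_m^{(1)}$ are any first approximations to the
unknowns $x_1, \ldots, x_m$. Substituting in the right side of (72), we find a
system of second approximations to the required roots:

$$\begin{aligned}
x_1^{(2)} &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2^{(1)} - \cdots - \frac{a_{1m}}{a_{11}}x_m^{(1)},\\
x_2^{(2)} &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1^{(1)} - \cdots - \frac{a_{2m}}{a_{22}}x_m^{(1)},\\
&\ \ \vdots\\
x_m^{(2)} &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1^{(1)} - \cdots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}^{(1)}.
\end{aligned}$$

In the same way, having found a system of $n$th approximations
$x_1^{(n)}, \ldots, x_m^{(n)}$ to our unknowns, we find the next approximations by
means of the formulas

$$\begin{aligned}
x_1^{(n+1)} &= \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}x_2^{(n)} - \cdots - \frac{a_{1m}}{a_{11}}x_m^{(n)},\\
x_2^{(n+1)} &= \frac{b_2}{a_{22}} - \frac{a_{21}}{a_{22}}x_1^{(n)} - \cdots - \frac{a_{2m}}{a_{22}}x_m^{(n)},\\
&\ \ \vdots\\
x_m^{(n+1)} &= \frac{b_m}{a_{mm}} - \frac{a_{m1}}{a_{mm}}x_1^{(n)} - \cdots - \frac{a_{m,m-1}}{a_{mm}}x_{m-1}^{(n)}.
\end{aligned} \tag{73}$$
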


It can be shown that this sequence of successive approximations
converges to the (unique) system of solutions if for every $k$ we have
the inequality

$$a_{kk} > |a_{k1}| + |a_{k2}| + \cdots + |a_{k,k-1}| + |a_{k,k+1}| + \cdots + |a_{km}|,$$

where $k = 1, \ldots, m$, or if

$$\sum_{j=1}^{m} \sum_{k=1}^{m} \left|\frac{a_{jk}}{a_{jj}}\right|^2 < m + 1.$$



Roughly speaking, the diagonal elements must be very large and 
positive. This restriction may appear to be very severe. However, 
there exist methods for reducing any system of linear equations (with 
as many unknowns as equations) to one in which these conditions 
hold (see, for example, Margulis: Systems of Linear Equations, Vol. 
14 in this series). 
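
Both conditions are simple to test for a given table of coefficients. The sketch below is hedged in two ways: the function names are hypothetical, and the second test assumes that the second condition reads as "the sum of the squares of all ratios $a_{jk}/a_{jj}$ is less than $m + 1$". The matrix used is the one whose rows give the rewritten system (69) of the worked example.

```python
def rows_dominant(a):
    """True if each diagonal entry exceeds the sum of |other entries| in its row."""
    return all(
        row[k] > sum(abs(v) for j, v in enumerate(row) if j != k)
        for k, row in enumerate(a)
    )

def squared_ratio_sum(a):
    """Sum over all j, k of (a[j][k] / a[j][j]) ** 2 (assumed reading of the text)."""
    return sum((v / row[k]) ** 2 for k, row in enumerate(a) for v in row)

system = [
    [10.0, -2.0, 1.0],
    [1.0, 5.0, -1.0],
    [4.0, 2.0, 8.0],
]

print(rows_dominant(system))
print(squared_ratio_sum(system), "<", len(system) + 1)
```

For this system both tests succeed, which is consistent with the convergence observed in the worked example.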

The remarks made in Chapter 5 hold here, too. For instance, the 
result does not depend on our choice of an initial approximation. 
Thus a mistake in the course of our calculations does not make all 
our work useless, but merely lengthens it. Moreover, if the sequence 
converges, as it will if the indicated conditions are satisfied, it 
will always converge to a solution of the system (in general there is 
not a unique solution, but under our conditions there is). 

The method we described may be modified in a number of ways.
For example, having found an approximate value $x_1^{(n+1)}$, we find the
approximation $x_2^{(n+1)}$ by substituting the values $x_1^{(n+1)}, x_3^{(n)}, \ldots, x_m^{(n)}$
for the unknowns in the right side of the second equation; then we
find $x_3^{(n+1)}$ by substituting $x_1^{(n+1)}, x_2^{(n+1)}, x_4^{(n)}, \ldots, x_m^{(n)}$ in the third
equation, and so on. A description of all the methods used for
approximating a solution for a system of linear equations could easily
fill a booklet of this size.
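
The modification just described can be tried on a small system. The sketch below is an illustration under assumptions: it uses the three rewritten equations $x_1 = 0.9 + 0.2x_2 - 0.1x_3$, $x_2 = 1.6 - 0.2x_1 + 0.2x_3$, $x_3 = 4 - 0.5x_1 - 0.25x_2$ of the worked example, whose solution is $(1, 2, 3)$, and the function name `modified_step` is hypothetical.

```python
def modified_step(x):
    """One pass in which each new value is used as soon as it has been found."""
    x1, x2, x3 = x
    x1 = 0.9 + 0.2 * x2 - 0.1 * x3
    x2 = 1.6 - 0.2 * x1 + 0.2 * x3    # uses the new x1
    x3 = 4.0 - 0.5 * x1 - 0.25 * x2   # uses the new x1 and the new x2
    return (x1, x2, x3)

x = (0.0, 0.0, 0.0)
history = [x]
for _ in range(4):
    x = modified_step(x)
    history.append(x)

for n, values in enumerate(history):
    print(n, [round(v, 3) for v in values])
```

On this system four passes of the modified scheme already bring every unknown within 0.01 of the solution.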
SUCCESSIVE
APPROXIMATIONS IN
GEOMETRY


WE have now described applications of the method of successive 
approximation for solving equations and systems of equations. This 
method may also be applied to certain questions of geometry. We 
shall show how it may be used to obtain a value for the circumference 
of a circle. It is well known that we may approximate a circle suc- 
cessively by an inscribed square, regular octagon, 16-gon,. . . and 
find the circumference of the circle as the limit of the perimeters of 
these polygons. We calculate each perimeter with the help of the 
previous one. 

Let us denote the side of our regular $2^n$-gon by $A_n$, and its
circumference by $P_n$. For example, $A_2$ is the side of an inscribed square,
and therefore equals $R\sqrt{2}$, and $P_2 = 4R\sqrt{2}$. Suppose we have
already found $P_n$. Then clearly

$$A_n = \frac{P_n}{2^n}.$$

Now the side $A_{n+1}$ of an inscribed regular $2^{n+1}$-gon may be expressed
in terms of the side $A_n$ of a regular $2^n$-gon and the radius $R$ by means
of the formula

$$A_{n+1} = R\sqrt{2 - \sqrt{4 - \frac{A_n^2}{R^2}}}. \tag{74}$$

This may be proved geometrically, but it is faster by trigonometry.
Thus we have

$$A_n = 2R\sin\frac{\pi}{M} \quad\text{and}\quad A_{n+1} = 2R\sin\frac{\pi}{2M} \qquad (M = 2^n)$$

(see sketch). Now since for any $\alpha$ we have

$$\sin\frac{\alpha}{2} = \sqrt{\frac{1 - \cos\alpha}{2}},$$

we see that

$$A_{n+1} = 2R\sin\frac{\pi}{2M} = 2R\sqrt{\frac{1 - \cos\frac{\pi}{M}}{2}} = R\sqrt{2 - 2\sqrt{1 - \sin^2\frac{\pi}{M}}} = R\sqrt{2 - \sqrt{4 - \frac{A_n^2}{R^2}}}.$$



This formula can now be used to calculate the perimeters. Recall
that

$$A_n = \frac{P_n}{2^n};$$

consequently we have

$$P_{n+1} = 2^{n+1} R \sqrt{2 - \sqrt{4 - \frac{P_n^2}{2^{2n}R^2}}}. \tag{75}$$

The sequence of numbers $P_2, P_3, \ldots, P_n, \ldots$ tends to the length of
the circumference (to $2\pi R$). Thus the formula (75) may be regarded
as a formula for the calculation of $2\pi R$ by the method of successive
approximation. Using it, and taking $R = 1$, we may calculate the
value of $\pi$ to any number of decimal places.
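
The perimeters are quick to tabulate. The sketch below assumes the recurrence in the form $P_{n+1} = 2^{n+1} R \sqrt{2 - \sqrt{4 - P_n^2 / (4^n R^2)}}$ with $P_2 = 4R\sqrt{2}$ for the inscribed square; the function name `perimeters` and the cut-off at the 1024-gon are illustrative choices.

```python
import math

def perimeters(last_n, radius=1.0):
    """Return [P_2, ..., P_last_n], starting from the inscribed square."""
    p = 4 * radius * math.sqrt(2)          # P_2
    result = [p]
    for n in range(2, last_n):
        inner = math.sqrt(4 - p * p / (4**n * radius**2))
        p = 2 ** (n + 1) * radius * math.sqrt(2 - inner)   # P_{n+1}
        result.append(p)
    return result

p_list = perimeters(10)
for n, p in enumerate(p_list, start=2):
    print(n, p / 2)    # with R = 1, half the perimeter approximates pi
```

One practical caution: for large $n$ the inner square root is very close to 2, so the subtraction loses significant digits in floating-point arithmetic. The formula is therefore best used for a moderate number of doublings unless it is rearranged.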

There is another method of approximating $\pi$, known as the method
of equal perimeters. We replace the regular $2^n$-gon by a regular
$2^{n+1}$-gon with the same perimeter and the same center. We denote
the length of the apothem of the regular $2^n$-gon by $l_n$, and the radius
of its circumcircle by $r_n$. (The apothem of a regular polygon is a line
drawn from the center and perpendicular to a side.) We denote the
length of the apothem and radius of the regular $2^{n+1}$-gon with the
same perimeter by $l_{n+1}$ and $r_{n+1}$, respectively.

Let $AB$ (Fig. 15) be a side of a $2^n$-gon inscribed in a circle of radius
$r_n$. We join the center $C$ of the arc $AB$ to $A$ and $B$, and then draw
$DE$ parallel to $AB$ and bisecting $CF$. It is clear that angle $DOE$ is
half angle $AOB$. So $DE$ is a side of a regular $2^{n+1}$-gon, inscribed in a
circle of radius $OD$. Since $DE = \tfrac{1}{2}AB$, the perimeter of this $2^{n+1}$-gon
is the same as that of our $2^n$-gon. This means that $r_{n+1} = OD$,
$l_{n+1} = OK$.
We easily verify that

$$l_{n+1} = \frac{r_n + l_n}{2}. \tag{76}$$

By considering the right triangle $ODC$ we also see that

$$r_{n+1} = \sqrt{r_n l_{n+1}}. \tag{77}$$


Formulas (76) and (77) give $r_{n+1}$ and $l_{n+1}$ in terms of $r_n$ and $l_n$.
As $n$ increases, the perimeters of the polygons do not change,
and the numbers $r_n$ and $l_n$ tend to the same limit. This limit is the
radius of a circle whose circumference has the same length as the
perimeters of all our polygons. If we choose our first polygon to
have perimeter 2, then our limit circle will have perimeter 2, so that
$2 = 2\pi R$, and $R = 1/\pi$. Thus

$$\lim_{n\to\infty} r_n = \lim_{n\to\infty} l_n = \frac{1}{\pi}.$$

Fig. 15

If we choose as our first polygon a square of side $\tfrac{1}{2}$, then $r_2 = \tfrac{1}{4}\sqrt{2}$,
$l_2 = \tfrac{1}{4}$. We thus have the following assertion: if we put $r_2 = \tfrac{1}{4}\sqrt{2}$,
$l_2 = \tfrac{1}{4}$ and calculate $r_{n+1}$, $l_{n+1}$ ($n = 2, 3, \ldots$) by means of the
formulas (76) and (77), then

$$\lim_{n\to\infty} r_n = \lim_{n\to\infty} l_n = \frac{1}{\pi}.$$

These formulas allow us to find an approximate value for $1/\pi$.
To obtain $1/\pi$ to within a given error $\varepsilon$, all we need do is continue
the process until we find an $n$ for which $r_n$ and $l_n$ differ by less than
$2\varepsilon$. For we may easily show that $1/\pi$ lies between $r_n$ and $l_n$ (whatever
the value of $n$), and therefore in this case within $\varepsilon$ of $l_{n+1}$.
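
The method of equal perimeters takes only a few lines. The sketch below assumes the recurrences in the form $l_{n+1} = (r_n + l_n)/2$ and $r_{n+1} = \sqrt{r_n l_{n+1}}$, started from a square of side $\tfrac{1}{2}$ (so $r_2 = \sqrt{2}/4$, $l_2 = \tfrac{1}{4}$); the function name `one_over_pi` is a hypothetical label.

```python
import math

def one_over_pi(eps):
    """Iterate (76) and (77) until r_n and l_n differ by less than 2 * eps."""
    r, l = math.sqrt(2) / 4, 0.25      # r_2 and l_2 for the square of side 1/2
    while r - l >= 2 * eps:
        l = (r + l) / 2                # (76): the new apothem
        r = math.sqrt(r * l)           # (77): uses the new apothem
    return (r + l) / 2                 # the midpoint, which is the next apothem

approx = one_over_pi(1e-6)
print(approx, 1 / approx)
```

The returned midpoint is exactly the value that formula (76) would produce on the next step, which is why stopping when the two numbers differ by less than $2\varepsilon$ gives an error below $\varepsilon$. The tolerance should stay well above the rounding level of the arithmetic in use.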




20 
CONCLUSION 


IN this book we have seen how the method of successive approxima- 
tion can be applied to a variety of problems: the drawing of plans, 
extraction of roots, solution of equations, calculation of the cir- 
cumference of a circle. The wealth of applications of the method is 
by no means exhausted by these examples. Many problems lead to 
differential equations (in which we are required to find a function 
which satisfies equations in which its derivatives enter), to integral 
equations, and to even more complex equations. One of the most 
powerful methods of approximate solution of such equations is, 
once again, the method of successive approximation (iteration). 
Of course, its application in such cases is more involved than in the 
solution of algebraic equations (the only sort we have considered). 
But we may safely say that without the use of methods of successive 
approximation we would not be able to tackle any of the impressive 
problems in physics and engineering (such as sending a rocket around 
the moon) that are constantly being solved nowadays. Not only the 
calculation of satellite orbits, but the calculations needed to start 
an atomic reactor, to study the structure of the atom, or to forecast 
the weather, all use this method. But a discussion of the applications 
of the method beyond the field of elementary mathematics would 
take us outside the scope of this book. 




EXERCISES 


Here are some exercises which the reader may use to check his mastery 
of the material of this book. 


Use the method of iteration (p. 48) to solve these equations: 


Pies 2. x = (x + 1) nD Oy (are 
(x + 1)? : eens x+1 
4.x=24+ Wx. 5. x= ¥/(5 — x). 6. 4 — 3x = tanx. 


1.x*=sinx. 8 x°=sinx. area 10. x = cos x. 


1 
11. x = arc-cos = 2x=I1+ if. sinx. 13. x = + v[(log (x + 2)]. 


10 
14. x? = In (x + 1). 15. Inx =4— x’. 16. Inx =2—~x. 
17. x? = e* + 2. 18. log x = 0.1 x. 19. x = arctan (log x). 
1 
— — pt 
20. x io’ 

Use the method of chords and tangents (p. 43) to solve these equations: 
21. x — Sx +1=0. 22. x° — 9x? + 20x — 1 = 0. 
23. x8 — 3x? — 3x + 10 = 0. 24. x° + 5x+1=0. 

25. sinx+x=1. 26. x2 — 10 logx — 3 = 0. 


† In some of these examples the reader must first rewrite the equations
in the form $x = \varphi(x)$.